Abstract: The capacity of the Gaussian wiretap channel model 
is analyzed when there are multiple antennas at the sender, 
intended receiver and eavesdropper. The associated channel 
matrices are fixed and known to all the terminals. A computable 
characterization of the secrecy capacity is established as the 
saddle point solution to a minimax problem. The converse is 
based on a Sato-type argument used in other broadcast settings, 
and the coding theorem is based on Gaussian wiretap codebooks. 

At high signal-to-noise ratio (SNR), the secrecy capacity is 
shown to be attained by simultaneously diagonalizing the channel 
matrices via the generalized singular value decomposition, and 
independently coding across the resulting parallel channels. 
The associated capacity is expressed in terms of the corre- 
sponding generalized singular values. It is shown that a semi- 
blind "masked" multi-input multi-output (MIMO) transmission 
strategy that sends information along directions in which there is 
gain to the intended receiver, and synthetic noise along directions 
in which there is not, can be arbitrarily far from capacity in this 
regime. 

Necessary and sufficient conditions for the secrecy capacity to 
be zero are provided, which simplify in the limit of many antennas 
when the entries of the channel matrices are independent and 
identically distributed. The resulting scaling laws establish that to 
prevent secure communication, the eavesdropper needs 3 times as 
many antennas as the sender and intended receiver have jointly, 
and that the optimum division of antennas between sender 
and intended receiver is in the ratio of 2 : 1. 

Index Terms — MIMO wiretap channel, secrecy capacity, cryp- 
tography, multiple antennas, broadcast channel. 

I. Introduction 

MULTIPLE antennas are a valuable resource in wireless 
communication. Over the last several years, there 
has been extensive activity in exploring the design, analysis, 
and implementation of wireless systems with multiple an- 
tennas, emphasizing their role in improving robustness and 
throughput. In this work, we develop aspects of the emerging 
role of multiple antennas in providing communication security 
at the physical layer. 

The wiretap channel [1] is an information-theoretic model 
for physical-layer security. In the model, there are three 
terminals — a sender, an intended receiver, and an eavesdrop- 
per. The goal is to exploit the structure of the underlying 
broadcast channel to transmit a message reliably to the intended 
receiver, while leaking asymptotically no information 
to the eavesdropper. A single-letter characterization of the 
secrecy capacity when the underlying broadcast channel is 
discrete and memoryless is developed in [2]. An explicit 
solution for the scalar Gaussian case is obtained in [3], where 
the optimality of Gaussian codebooks is established. 

In this paper, we consider the case where there are multiple 
antennas at each of the three terminals, referring to it as 
the multi-input, multi-output, multi-eavesdropper (MIMOME) 
channel. In our model, the channel matrices are fixed and 
known to all three terminals. While the eavesdropper's channel 
being known to both the sender and the receiver in the problem 
formulation is a strong assumption, we remark in advance that 
the solution provides ultimate limits on secure transmission 
with multiple antennas, and thus serves as a starting point 
for other formulations. Further discussion of the modeling 
assumptions is provided in the companion paper [4] and the 
compound extension has been recently treated in [5]. 

The problem of evaluating the secrecy capacity of channels 
with multiple antennas has attracted increasing attention in 
recent years. As a starting point, for Gaussian models in which 
the channel matrices of intended receiver and eavesdropper 
are square and diagonal, the results in [6]-[9], which consider 
secure transmission over fading channels, can be applied. In 
particular, for this special case of independent parallel Gaus- 
sian subchannels, it follows that using independent Gaussian 
wiretap codebooks across the subchannels achieves capacity. 

More generally, the MIMOME channel is a nondegraded 
broadcast channel to which the Csiszár-Körner capacity expression 
[2] applies in principle. However, computing the 
capacity directly from [2] appears difficult, as observed in, 
e.g., [10]-[13]. 

To the best of our knowledge, the first computable upper 
bound for the secrecy capacity of the Gaussian multi-antenna 
wiretap channel appears in [4], [14], which is used to establish 
the secrecy capacity in the special (MISOME) case that the 
intended receiver has a single antenna. This approach involves 
revealing the output of the eavesdropper's channel to the 
legitimate receiver to create a fictitious degraded broadcast 
channel, and results in a minimax expression for the upper 
bound, analogous to the technique of Sato [15] used to 
upper bound the sum-capacity of the multi-antenna broadcast 
channel; see, e.g., [16]. 

In [4], [14], this minimax upper bound is used to obtain 
a closed-form expression for the secrecy capacity in the 
MISOME case. In addition, a number of insights are developed 
into the behavior of the secrecy capacity. In the high 
signal-to-noise ratio (SNR) regime, the simple masked beamforming 
scheme developed in [11] is shown to be near optimal. Also, 
the scaling behavior of the secrecy capacity in the limit of 
many antennas is studied. 

We note that this upper bounding approach has been independently 
conceived by Ulukus et al. [17] and further applied 
to the case of two transmit antennas, two receive antennas, 
and a single eavesdropper antenna [18]. Subsequently, this 
minimax upper bound was shown to be tight for the MIMOME 
case in [19] and, independently, [20] (see also [21]). Both 
treatments start from the minimax upper bound of [4] and 
work with the optimality conditions to establish that the saddle 
value is achievable with the standard Gaussian wiretap code 
construction [2]. 

In some of the most recent work, [22] provides an alternative 
derivation of the MIMOME secrecy capacity using an approach 
based on channel-enhancement techniques introduced 
in [23]. The two approaches offer complementary insights into 
the problem. The minimax upper bounding approach in [19], 
[20] provides a computable characterization for the capacity 
expression and identifies a hidden convexity in optimizing the 
Csiszár-Körner expression with Gaussian inputs, whereas the 
channel enhancement approach does not. On the other hand, the 
latter approach establishes the capacity given any covariance 
constraint on the input distribution, not just the sum-power 
constraint to which the minimax upper bounding approach has 
been limited. 

Finally, the diversity-multiplexing tradeoff of the multi-antenna 
wiretap channel has been recently studied in [24]. 

An outline of the paper is as follows. Section II summarizes 
some notational conventions for the paper. Section III 
describes the basic channel and system model, as well as 
a canonical decomposition of the channel in terms of its 
generalized singular values, which is used in some of the 
asymptotic analysis. Section IV summarizes the main results 
of the paper, and Sections V-VII provide the corresponding 
analysis. In particular, Section V develops the minimax 
characterization of the secrecy capacity, Section VI develops 
the high-SNR analysis in terms of the generalized singular 
values, and Section VII develops the conditions under which 
the secrecy capacity is zero in the limit of many antennas. 
Finally, Section VIII contains some concluding remarks. 

II. Notation 

In terms of fonts, bold upper and lower case characters are 
used for matrices and vectors, respectively. Random variables 
are distinguished from their realizations by the use of sans-serif 
fonts for the former and regular serifed fonts for the 
latter. Sets are denoted using calligraphic fonts. We generally 
reserve the symbols $I(\cdot)$ for mutual information and 
$h(\cdot)$ for differential entropy, and all logarithms are base-2 
unless otherwise indicated. In addition, $\mathcal{CN}(0, K)$ denotes a 
circularly-symmetric complex-valued Gaussian random vector 
with covariance matrix $K$. 

The set of all $n$-dimensional complex-valued vectors is 
denoted by $\mathbb{C}^{n}$, and the set of $m \times n$-dimensional matrices 
is denoted using $\mathbb{C}^{m \times n}$. In addition, $I$ denotes the identity 
matrix and $0$ denotes the zero matrix. When the dimensions 
of these matrices are not clear from context, we will explicitly 
indicate their size via subscripts; e.g., $0_{n \times m}$ denotes an $n \times m$ 
zero matrix, $0_n$ denotes a vector of zeros of length $n$, and 
$I_n$ denotes an $n \times n$ identity matrix. We further use the 
notation $[\cdot]_{i:j}$ for $j \ge i$ to denote the subvector of its vector 
argument corresponding to indices $i, i+1, \ldots, j$. Likewise, 
$[\cdot]_{i:j,k:l}$ denotes the submatrix formed from rows $i$ through $j$ 
and columns $k$ through $l$ of its matrix argument. 

Matrix transposition is denoted using the superscript $T$, the 
Hermitian (i.e., conjugate) transpose of a matrix is denoted 
using the superscript $\dagger$, the Moore-Penrose pseudo-inverse 
is denoted by the superscript $+$, and the projection matrix onto the null 
space is denoted by the superscript $\sharp$. In addition, $\mathrm{Null}(\cdot)$, $\mathrm{rank}(\cdot)$, and 
$\sigma_{\max}(\cdot)$ denote the null space, rank, and largest singular value, 
respectively, of their matrix arguments. Moreover, we say 
a matrix has full column-rank if its rank is equal to the 
number of columns, and the notation $A \succ 0$ means that $A$ 
is positive definite, with $A \succeq 0$ likewise denoting positive 
semidefiniteness. 

In other notation, $\dim(\cdot)$ denotes the dimension of its 
subspace argument, $\mathrm{span}(\cdot)$ denotes the subspace spanned by 
the collection of vectors that are its argument, and $(\cdot)^{\perp}$ denotes 
the orthogonal complement of a subspace. Moreover, $\|\cdot\|$ 
denotes the usual Euclidean norm of a vector argument, $\mathrm{tr}(\cdot)$ 
and $\det(\cdot)$ denote the trace and determinant of a matrix, 
respectively, and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix whose 
diagonal elements are given by its argument. 

Finally, we use $\overset{\text{a.s.}}{=}$ and $\xrightarrow{\text{a.s.}}$ to denote almost-sure equality 
and convergence, respectively, and additionally use standard 
order notation. Specifically, $O(\epsilon)$ and $o(\epsilon)$ denote terms such 
that $O(\epsilon)/\epsilon < \infty$ and $o(\epsilon)/\epsilon \to 0$, respectively, in the 
associated limit, so that, e.g., $o(1)$ represents a vanishing term. 

III. Channel and System Model 

Using $n_t$, $n_r$, and $n_e$ to denote the number of antennas at 
the sender, intended receiver, and eavesdropper, respectively, 
the received signals at the intended receiver and eavesdropper 
in the channel model of interest are, respectively, 

$$\begin{aligned} y_r(t) &= H_r x(t) + z_r(t) \\ y_e(t) &= H_e x(t) + z_e(t) \end{aligned} \qquad t = 1, 2, \ldots, \tag{1}$$

where $x(t)$ is the transmitted signal, where $H_r \in \mathbb{C}^{n_r \times n_t}$ and 
$H_e \in \mathbb{C}^{n_e \times n_t}$ are complex channel gain matrices, and where 
$z_r(t)$ and $z_e(t)$ are each independent and identically distributed 
(i.i.d.) noises whose samples are $\mathcal{CN}(0, I)$ random variables. 
The channel matrices are constant (over the transmission 
interval) and known to all three terminals. Moreover, the 
channel input satisfies the power constraint 

$$E\left[\frac{1}{n} \sum_{t=1}^{n} \|x(t)\|^2\right] \le P.$$

A rate $R$ is achievable if there exists a sequence of length-$n$ 
codes such that both the error probability at the intended 
receiver and $I(w; y_e^n)/n$ approach zero as $n \to \infty$, where $w$ 
denotes the transmitted message. The 
secrecy capacity is the supremum of all achievable rates. 
A. Channel Decomposition 

For some of our analysis, it will be convenient to exploit the 
generalized singular value decomposition (GSVD) [25], [26] 
of the channel (1). To develop this decomposition, we first 
define the subspaces 

$$S_r = \mathrm{Null}(H_r)^{\perp} \cap \mathrm{Null}(H_e) \tag{2a}$$
$$S_{r,e} = \mathrm{Null}(H_r)^{\perp} \cap \mathrm{Null}(H_e)^{\perp} \tag{2b}$$
$$S_e = \mathrm{Null}(H_r) \cap \mathrm{Null}(H_e)^{\perp} \tag{2c}$$
$$S_n = \mathrm{Null}(H_r) \cap \mathrm{Null}(H_e), \tag{2d}$$

corresponding to classes of inputs that have nonzero gain to, 
respectively, the intended receiver only, both intended receiver 
and eavesdropper, the eavesdropper only, and neither. Letting 

$$k = \mathrm{rank}(H) \tag{3}$$

with 

$$H = \begin{bmatrix} H_r \\ H_e \end{bmatrix}, \tag{4}$$

it follows that $\dim(S_n) = n_t - k$. Moreover, we use the 
notation 

$$p = \dim(S_r) \quad \text{and} \quad s = \dim(S_{r,e}), \tag{5}$$

from which it follows that $\dim(S_e) = k - p - s$. 

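Using this notation, our channel decomposition is as follows. 

Definition 1: The GSVD of $(H_r, H_e)$ takes the form 

$$H_r = \Psi_r \Sigma_r \begin{bmatrix} \Omega^{-1} & 0_{k \times (n_t - k)} \end{bmatrix} \Psi_t^{\dagger} \tag{6a}$$
$$H_e = \Psi_e \Sigma_e \begin{bmatrix} \Omega^{-1} & 0_{k \times (n_t - k)} \end{bmatrix} \Psi_t^{\dagger} \tag{6b}$$

where $\Psi_r \in \mathbb{C}^{n_r \times n_r}$, $\Psi_e \in \mathbb{C}^{n_e \times n_e}$, and $\Psi_t \in \mathbb{C}^{n_t \times n_t}$ are 
unitary, where $\Omega \in \mathbb{C}^{k \times k}$ is lower triangular and nonsingular, 
and where 

$$\Sigma_r = \begin{bmatrix} 0 & 0 & 0 \\ 0 & D_r & 0 \\ 0 & 0 & I_p \end{bmatrix} \in \mathbb{C}^{n_r \times k}, \qquad \Sigma_e = \begin{bmatrix} I_{k-p-s} & 0 & 0 \\ 0 & D_e & 0 \\ 0 & 0 & 0 \end{bmatrix} \in \mathbb{C}^{n_e \times k}, \tag{7}$$

with column blocks of widths $k-p-s$, $s$, and $p$, are diagonal with 

$$D_r = \mathrm{diag}(r_1, \ldots, r_s), \qquad D_e = \mathrm{diag}(e_1, \ldots, e_s), \tag{8}$$

the diagonal entries of which are real and strictly positive. The 
associated generalized singular values are 

$$\sigma_i = \frac{r_i}{e_i}, \qquad i = 1, 2, \ldots, s. \tag{9}$$

For convenience, we choose the (otherwise arbitrary) indexing 
so that $\sigma_1 \le \sigma_2 \le \cdots \le \sigma_s$. 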

IV. Summary of Main Results 

In this section we summarize the main results in this paper. 
The analysis is provided in Sections V-VII. 



A. MIMOME Secrecy Capacity 

A characterization of the secrecy capacity of the MIMOME 
channel is as follows. 

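Theorem 1: The secrecy capacity of the MIMOME wiretap 
channel (1) is 

$$C = \min_{K_\Phi \in \mathcal{K}_\Phi} \; \max_{K_P \in \mathcal{K}_P} R_+(K_P, K_\Phi), \tag{10}$$

where 

$$R_+(K_P, K_\Phi) = I(x; y_r \mid y_e), \tag{11}$$

with $x \sim \mathcal{CN}(0, K_P)$ and 

$$\mathcal{K}_P = \{K_P : K_P \succeq 0,\ \mathrm{tr}(K_P) \le P\}, \tag{12}$$

and where 

$$\begin{bmatrix} z_r \\ z_e \end{bmatrix} \sim \mathcal{CN}(0, K_\Phi), \tag{13}$$

with$^1$ 

$$\mathcal{K}_\Phi = \left\{ K_\Phi : K_\Phi = \begin{bmatrix} I_{n_r} & \Phi \\ \Phi^{\dagger} & I_{n_e} \end{bmatrix},\ K_\Phi \succeq 0 \right\}. \tag{14}$$

Furthermore, the minimax problem of (10) is convex-concave 
with saddle point solution $(\bar{K}_P, \bar{K}_\Phi)$, via which the secrecy 
capacity can be expressed in the form 

$$C = R_-(\bar{K}_P) = \log \frac{\det(I + H_r \bar{K}_P H_r^{\dagger})}{\det(I + H_e \bar{K}_P H_e^{\dagger})}. \tag{15}$$

Finally, $C = 0$ if and only if 

$$H_r = \bar{\Theta} H_e, \tag{16}$$

where 

$$\bar{\Theta} = \Theta(\bar{K}_P), \qquad \Theta(K_P) = \Theta(K_P, \bar{K}_\Phi), \tag{17}$$

with 

$$\Theta(K_P, K_\Phi) = (H_r K_P H_e^{\dagger} + \Phi)(I + H_e K_P H_e^{\dagger})^{-1} \tag{18}$$

denoting the coefficient in the linear minimum mean-square 
error (MMSE) estimate of $y_r$ from $y_e$. 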
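As an illustrative numerical sketch (not part of the original development), the quantities appearing in Theorem 1 can be evaluated for a given input covariance. The helper names `log2det`, `secrecy_rate`, and `mmse_coefficient` are hypothetical, NumPy is assumed, and the covariance used below is an arbitrary feasible choice rather than the saddle point solution, so the computed rate is in general only a lower bound on $C$.

```python
import numpy as np

def log2det(M):
    """Base-2 log-determinant of a matrix with positive determinant."""
    _, logabs = np.linalg.slogdet(M)
    return logabs / np.log(2.0)

def secrecy_rate(Hr, He, K):
    """Expression (15) evaluated at an arbitrary covariance K (bits per use)."""
    nr, ne = Hr.shape[0], He.shape[0]
    return (log2det(np.eye(nr) + Hr @ K @ Hr.conj().T)
            - log2det(np.eye(ne) + He @ K @ He.conj().T))

def mmse_coefficient(Hr, He, K, Phi):
    """Theta(K_P, K_Phi) in (18): linear MMSE coefficient of y_r from y_e."""
    ne = He.shape[0]
    G = np.eye(ne) + He @ K @ He.conj().T
    return (Hr @ K @ He.conj().T + Phi) @ np.linalg.inv(G)

# Hypothetical example: 3 transmit antennas, 2 antennas at each receiver.
rng = np.random.default_rng(0)
nt, nr, ne, P = 3, 2, 2, 10.0
Hr = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
He = (rng.standard_normal((ne, nt)) + 1j * rng.standard_normal((ne, nt))) / np.sqrt(2)
K = (P / nt) * np.eye(nt)        # isotropic input with tr(K) = P (not optimized)
rate = secrecy_rate(Hr, He, K)   # can be negative for a poor choice of K
```

When the two channels coincide and $\Phi = I$, the coefficient returned by `mmse_coefficient` reduces to the identity, which is consistent with condition (16) and with a zero secrecy rate.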

Several remarks are worthwhile. First, our result can be 
related to the Csiszár-Körner characterization of the secrecy 
capacity for a nondegraded discrete memoryless broadcast 
channel $p_{y_r, y_e \mid x}$ in the form [2] 

$$C = \max_{p_u,\, p_{x \mid u}} I(u; y_r) - I(u; y_e), \tag{19}$$

where $u$ is an auxiliary random variable (over some alphabet 
with bounded cardinality) that satisfies the Markov constraint 
$u \leftrightarrow x \leftrightarrow (y_r, y_e)$. As [2] remarks, the secrecy capacity 
(19) can be extended to incorporate continuous-valued inputs 
of the type of interest in the present paper. With such an 
extension, Theorem 1, and in particular (15), can be interpreted 
as (indirectly) establishing a suitable Gaussian wiretap code 
for achieving capacity.$^2$ Specifically, via the chain rule, 

$$I(x; y_r \mid y_e) = [I(x; y_r) - I(x; y_e)] + I(x; y_e \mid y_r),$$

where the last term on the right-hand side is zero when $\Phi = \bar{\Phi}$, 
and thus we have the following immediate corollary. 

Corollary 1: The secrecy capacity of the MIMOME wiretap 
channel is achieved by a wiretap coding scheme in which 
$u \sim \mathcal{CN}(0, K_P)$ with $K_P = \bar{K}_P$, and $x = u$. 

Footnote 1: The constraint $K_\Phi \succeq 0$ is equivalently expressed as the requirement that 
$\sigma_{\max}(\Phi) \le 1$, as we will exploit. 

Footnote 2: Each candidate $(u, x)$ in (19) corresponds to a particular coding scheme 
based on binning, which we generically refer to as a "wiretap code," and which 
achieves rate $I(u; y_r) - I(u; y_e)$. 

From this perspective, our result can also be interpreted as 
a convex reformulation of the nonconvex optimization (19). 
Indeed, even after knowing both that an optimizing $u$ is 
Gaussian and that $x = u$ is sufficient (which itself is nontrivial), 
determining the optimal covariance via 

$$\bar{K}_P \in \arg\max_{K_P \in \mathcal{K}_P} \log \frac{\det(I + H_r K_P H_r^{\dagger})}{\det(I + H_e K_P H_e^{\dagger})} \tag{20}$$

with $\mathcal{K}_P$ as defined in (12), is a nonconvex problem.$^3$ And 
even if one verifies that $\bar{K}_P$ satisfies the Karush-Kuhn-Tucker 
(KKT) conditions associated with (20), these necessary 
conditions only establish local optimality, i.e., that $\bar{K}_P$ is 
a stationary point of the associated objective function. By 
contrast, (10) establishes that the (global) solution to (20) 
is obtained as the solution to a convex problem, as well as 
establishing the optimality of a Gaussian input distribution. 

Second, additional insights are obtained from the structure 
of the saddle point solution $(\bar{K}_P, \bar{K}_\Phi)$. In particular, using $\bar{\Phi}$ 
to denote the optimal cross-covariance, i.e., [cf. (14)] 

$$\bar{K}_\Phi = \begin{bmatrix} I_{n_r} & \bar{\Phi} \\ \bar{\Phi}^{\dagger} & I_{n_e} \end{bmatrix}, \tag{21}$$

we establish in the course of our development of Theorem 1 
the following key property. 

Property 1: The saddle point solution $(\bar{K}_P, \bar{K}_\Phi)$ to the 
MIMOME wiretap channel capacity (10) satisfies 

$$\bar{\Phi}^{\dagger} H_r S = H_e S, \qquad \forall\ \text{full column-rank } S \text{ s.t. } S S^{\dagger} = \bar{K}_P, \tag{22}$$

provided $H_r \ne \bar{\Theta} H_e$ (i.e., provided $C \ne 0$). 
It follows from (22) that the effective channel to the eavesdropper 
is a degraded version of that to the intended receiver. 
Indeed, the intended receiver can simulate the eavesdropper 
channel by adding noise. Specifically, it generates 

$$y'_e = \bar{\Phi}^{\dagger} y_r + w,$$

where the added noise $w \sim \mathcal{CN}(0, I - \bar{\Phi}^{\dagger}\bar{\Phi})$ is independent 
of $y_r$, so, using (1), (22), and the notation $x = S x'$ with $x' \sim 
\mathcal{CN}(0, I)$, we have 

$$y'_e = \bar{\Phi}^{\dagger} H_r S x' + \bar{\Phi}^{\dagger} z_r + w = H_e S x' + z'_e = H_e x + z'_e,$$

where $z'_e \sim \mathcal{CN}(0, I)$. In essence, the optimal signal design for 
transmission is such that no information is transmitted along 
any direction where the eavesdropper observes a stronger 
signal than the legitimate receiver. A key consequence is 
that a genie-aided system in which $y_e$ is provided to the 
receiver, which would otherwise provide only an upper bound 
on capacity in general, does not increase the capacity of the 
channel in this case, a feature that is ultimately central to our 
analysis. 

Footnote 3: Note that in the high-SNR regime, (20) reduces to 

$$\max_{K} \frac{\det(H_r K H_r^{\dagger})}{\det(H_e K H_e^{\dagger})},$$

which is the well-studied multiple-discriminant function in multivariate statistics; 
see, e.g., [27]. 

Finally, the condition (16) corresponding to when the secrecy 
capacity is zero has a natural physical interpretation. In 
particular, under this condition, the effective channel to the 
intended receiver is a degraded version of that to the eavesdropper. 
Indeed, the eavesdropper can simulate the intended 
receiver by adding noise. Specifically, it generates 

$$y'_r = \bar{\Theta} y_e + w,$$

where the added noise $w \sim \mathcal{CN}(0, I - \bar{\Phi}\bar{\Phi}^{\dagger})$ is independent 
of $y_e$, so, using (1), we have 

$$y'_r = \bar{\Theta} H_e x + \bar{\Theta} z_e + w = H_r x + z'_r,$$

where $z'_r \sim \mathcal{CN}(0, I)$ since 

$$\bar{\Theta} = \bar{\Phi} \quad \text{if } H_r = \bar{\Theta} H_e, \tag{23}$$

which follows from (17) with (16). 

B. Secrecy Capacity in the High-SNR Regime 

In the high-SNR limit (i.e., $P \to \infty$), the secrecy capacity 
(10) is naturally described in terms of the GSVD of the channel 
(1), as defined in (6). The GSVD simultaneously diagonalizes 
$H_r$ and $H_e$, yielding an equivalent parallel channel model 
for the problem. As such, a capacity-approaching scheme in 
the high-SNR regime involves using for transmission (with a 
wiretap code) only those subchannels for which the gain to 
the intended receiver is larger, and the following convenient 
expression for the capacity (10) results. 

Theorem 2: Let $\sigma_1 \le \sigma_2 \le \cdots \le \sigma_s$ be the generalized 
singular values of $(H_r, H_e)$. Then as $P \to \infty$, the secrecy 
capacity of the MIMOME wiretap channel (1) takes the 
asymptotic form 

$$C(P) = C_0(P) + \sum_{j:\, \sigma_j > 1} \log \sigma_j^2 - o(1), \tag{24}$$

where 

$$C_0(P) = \begin{cases} \log\det\left(I + \dfrac{P}{p} H_r H_e^{\sharp} H_r^{\dagger}\right), & \mathrm{rank}(H_e) < n_t \\ 0, & \mathrm{rank}(H_e) = n_t, \end{cases} \tag{25}$$

with $p$ and $s$ as given in (5), and with $H_e^{\sharp}$ denoting the 
projection matrix onto $\mathrm{Null}(H_e)$. 

Note that a simple and intuitive transmission scheme for 
the MIMOME channel would involve simultaneously and 
isotropically transmitting information in $\mathrm{Null}(H_r)^{\perp}$, where 
there is gain to the intended receiver, and (synthetic) noise 
in $\mathrm{Null}(H_r)$, which does not affect the intended receiver but 
does reduce the quality of the eavesdropper's received signal.$^4$ 
This "masked" multi-input, multi-output (MIMO) transmission 
scheme is the natural generalization of the masked beamforming 
proposed in [11] for the MISOME wiretap channel. For the 
MISOME channel, such an approach is near optimal, as shown 
in [4]. However, we now show that such a masked MIMO 
scheme can be quite far from optimal 
on the MIMOME channel. 

Footnote 4: Note that the scheme is semi-blind: the transmitter does not need to know 
$H_e$ to construct the required subspaces, but does need to know $H_e$ in order 
to choose the communication rate. 

For convenience, we restrict our attention to the case in 
which $n_r \le n_t \le n_e$ and $H_r$ and $H_e$ are full rank, i.e., 
$\mathrm{rank}(H_r) = n_r$ and $\mathrm{rank}(H_e) = n_t$, and thus $k = n_t$, $p = 0$, 
and $s = n_r$ in the GSVD. 

The masked MIMO scheme is naturally viewed as a wiretap 
coding scheme in which a particular (rather than optimal) 
choice for $(x, u)$ is imposed in (19). In particular, first we 
choose $u$ to correspond to (information-bearing) codewords in 
a randomly generated codebook, i.e., 

$$u = (b_1, \ldots, b_{n_r}), \tag{26a}$$

where the elements are generated in an i.i.d. manner according 
to $\mathcal{CN}(0, P_t)$ with 

$$P_t = \frac{P}{n_t}. \tag{26b}$$

Additionally, we let $b_{n_r+1}, \ldots, b_{n_t}$ be randomly generated 
(synthetic) noise, i.e., independent $\mathcal{CN}(0, P_t)$ random variables. 

Next, we choose the transmission $x$ according to 

$$x = \sum_{j=1}^{n_t} b_j v_j, \tag{26c}$$

where the vectors $v_1, \ldots, v_{n_t}$ are chosen as follows. Let 

$$H_r = U \Lambda V_r^{\dagger} \tag{27}$$

be the compact singular value decomposition (SVD) of $H_r$. 
Since $\mathrm{rank}(H_r) = n_r$, this means that $U$ is $n_r \times n_r$ and unitary, 
$\Lambda$ is $n_r \times n_r$ and diagonal with positive diagonal elements, 
and $V_r$ is $n_t \times n_r$ with orthonormal columns. Then we choose 
$v_1, \ldots, v_{n_r}$ in (26c) as the columns of $V_r$, i.e., 

$$V_r = [v_1 \ \ v_2 \ \cdots \ v_{n_r}],$$

and (freely) choose 

$$V_n \triangleq [v_{n_r+1} \ \cdots \ v_{n_t}], \tag{28}$$

a basis for the null space of $H_r$, so that $[V_r \ \ V_n]$ is unitary. 

As we will establish, substituting these parameters in the 
argument of (19) yields the achievable rate 

$$R_{\mathrm{SN}}(P) = \log\det\left[ (P_t I + \Lambda^{-2}) \left( H_r (I + P_t H_e^{\dagger} H_e)^{-1} H_r^{\dagger} \right) \right], \tag{29}$$

which in the high-SNR regime reduces to 

$$\lim_{P \to \infty} R_{\mathrm{SN}}(P) = \log\det\left( H_r (H_e^{\dagger} H_e)^{-1} H_r^{\dagger} \right) = \sum_{j=1}^{s} \log \sigma_j^2, \tag{30}$$

where the second equality comes from expanding $H_r$ and $H_e$ 
via (6), with $\sigma_1, \sigma_2, \ldots$ denoting the generalized singular values 
(9). Comparing (30) and (24), we see that the asymptotic 
gap to capacity is 

$$\lim_{P \to \infty} \left[ C(P) - R_{\mathrm{SN}}(P) \right] = \sum_{j:\, \sigma_j < 1} \log \frac{1}{\sigma_j^2},$$

which, evidently, can be arbitrarily large when there are small 
generalized singular values. 
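The relations (29) and (30) can be checked numerically. The following is a minimal sketch under the full-rank assumptions stated above ($n_r \le n_t \le n_e$); the function names are hypothetical, NumPy is assumed, and the generalized singular values are computed from a generalized eigenvalue problem rather than from an explicit GSVD routine.

```python
import numpy as np

def log2det(M):
    """Base-2 log-determinant of a matrix with positive determinant."""
    return np.linalg.slogdet(M)[1] / np.log(2.0)

def generalized_singular_values(Hr, He):
    """Nonzero generalized singular values of (Hr, He), in ascending order.

    Assumes He has full column rank, so that sigma_j^2 are the nonzero
    eigenvalues of (He^H He)^{-1} Hr^H Hr.
    """
    G = np.linalg.solve(He.conj().T @ He, Hr.conj().T @ Hr)
    lam = np.sort(np.linalg.eigvals(G).real)
    s = np.linalg.matrix_rank(Hr)
    return np.sqrt(lam[-s:])

def masked_mimo_rate(Hr, He, P):
    """Rate (29) of the masked MIMO scheme, assuming rank(Hr) = n_r."""
    nr, nt = Hr.shape
    Pt = P / nt
    lam = np.linalg.svd(Hr, compute_uv=False)   # diagonal of Lambda in (27)
    first = np.diag(Pt + lam ** -2.0)
    inner = np.linalg.inv(np.eye(nt) + Pt * (He.conj().T @ He))
    return log2det(first) + log2det(Hr @ inner @ Hr.conj().T)

# Hypothetical example dimensions: n_r = 2, n_t = 3, n_e = 6.
rng = np.random.default_rng(1)
nr, nt, ne = 2, 3, 6
Hr = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
He = (rng.standard_normal((ne, nt)) + 1j * rng.standard_normal((ne, nt))) / np.sqrt(2)

sigma = generalized_singular_values(Hr, He)
limit_rate = np.sum(np.log2(sigma ** 2))               # right-hand side of (30)
gap = np.sum(np.log2(1.0 / sigma[sigma < 1.0] ** 2))   # asymptotic gap to capacity
```

Evaluating `masked_mimo_rate` at a large power should approach `limit_rate`, and `gap` grows without bound as any generalized singular value below one tends to zero.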



In concluding this section, we emphasize that only in 
the high-SNR regime do the generalized singular values of 
$(H_r, H_e)$ completely characterize the capacity-achieving and 
masked MIMO coding schemes. 

C. MIMOME Channel Scaling Laws 

By using sufficiently many antennas, the eavesdropper can 
drive the secrecy capacity to zero. In such a regime, the 
eavesdropper would be able to decode a nonvanishing fraction 
of any sent message, even when the sender and receiver 
fully exploit knowledge of $H_e$. In general, this threshold 
depends on the numbers of antennas at the transmitter and 
intended receiver, as well as on the particular channels to 
intended receiver and eavesdropper. One characterization of 
this threshold is given by (16) in Theorem 1. An equivalent 
characterization that is more useful in the development of 
scaling laws is as follows. 

Claim 1: The secrecy capacity of the MIMOME channel is 
zero if and only if 

$$\sigma_{\max}(H_r, H_e) \triangleq \sup_{v \in \mathbb{C}^{n_t}} \frac{\|H_r v\|}{\|H_e v\|} \le 1, \tag{31}$$

where $\sigma_{\max}(H_r, H_e)$ denotes the channel's largest generalized 
singular value. 
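Condition (31) can be tested numerically when $H_e$ has full column rank, since the supremum is then the square root of the largest eigenvalue of the generalized eigenvalue problem $H_r^{\dagger} H_r v = \lambda H_e^{\dagger} H_e v$. The sketch below makes that assumption explicit; the function names and the tolerance are hypothetical choices, and NumPy is assumed.

```python
import numpy as np

def max_generalized_singular_value(Hr, He):
    """sigma_max(Hr, He) in (31), assuming He has full column rank."""
    A = He.conj().T @ He          # positive definite under the assumption
    B = Hr.conj().T @ Hr
    lam = np.linalg.eigvals(np.linalg.solve(A, B)).real
    return float(np.sqrt(max(lam.max(), 0.0)))

def secrecy_capacity_is_zero(Hr, He, tol=1e-9):
    """Claim 1: zero secrecy capacity if and only if sigma_max <= 1."""
    return max_generalized_singular_value(Hr, He) <= 1.0 + tol

# Hypothetical example: the eavesdropper's channel is uniformly stronger.
Hr_weak = np.eye(2)
He_strong = 2.0 * np.eye(2)
zero_capacity = secrecy_capacity_is_zero(Hr_weak, He_strong)
```

If $H_e$ is rank deficient, an input in $\mathrm{Null}(H_e)$ that is visible to the intended receiver makes the supremum infinite, a case this sketch does not handle.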

When the coefficients of the channels are drawn at random, 
and the numbers of antennas are large, the threshold becomes 
independent of the channel realization. The following result 
characterizes this scaling behavior. 

Corollary 2: Suppose that $H_r$ and $H_e$ have i.i.d. $\mathcal{CN}(0, 1)$ 
entries that are fixed for the entire period of transmission, and 
known to all the terminals. Then when $n_r, n_e, n_t \to \infty$ such 
that $\gamma = n_r/n_e$ and $\beta = n_t/n_e$ are fixed constants, the secrecy 
capacity satisfies $C(H_r, H_e) \xrightarrow{\text{a.s.}} 0$ if and only if 

$$0 \le \beta \le \frac{1}{2} \quad \text{and} \quad \gamma \le (1 - \sqrt{2\beta})^2. \tag{32}$$

Fig. 1 depicts the zero-capacity region (32). In this plot, 
the solid curve describes the relative number of antennas an 
eavesdropper needs to prevent secure communication, as a 
function of the antenna resources available at the transmitter 
and intended receiver. The related scaling law developed for 
the MISOME case [4] corresponds to the vertical intercept of 
this plot: $C \xrightarrow{\text{a.s.}} 0$ when $\beta \le 1/2$, i.e., when the eavesdropper 
has at least twice the number of antennas as the sender. 
Note, too, that the single transmit antenna (SIMOME) case 
corresponds to the horizontal intercept; in this case we see 
that $C \xrightarrow{\text{a.s.}} 0$ when $\gamma \le 1$, i.e., when the eavesdropper has 
more antennas than the intended receiver. 

We can further use such scaling analysis to determine 
the best asymptotic allocation of a (large) fixed number of 
antennas $T$ between transmitter and intended receiver in the 
presence of an eavesdropper. In particular, the optimum 
allocation is 

$$(\beta^*, \gamma^*) = \arg\min_{\{(\beta, \gamma):\ 0 \le \beta \le 1/2,\ \gamma = (1 - \sqrt{2\beta})^2\}} (\beta + \gamma) = \left( \frac{2}{9}, \frac{1}{9} \right), \tag{33}$$

as is easily verified. Thus, the allocation that best thwarts the 
eavesdropper is $n_r/n_t = 1/2$, which requires the eavesdropper 
to use $3T$ antennas to prevent secure communication. 

It is worth remarking that the objective function in (33) 
is rather insensitive to deviations from the optimal antenna 
allocation, as Fig. 2 demonstrates. In fact, even if we were 
to allocate equal numbers of antennas to the sender and the 
receiver, the eavesdropper would still need $(3/2 + \sqrt{2})T \approx 
2.9142\,T$ antennas to drive the secrecy capacity to zero. 

Fig. 1. The efficient frontier of secure communication region as a function of 
the number of antennas at the transmitter and intended receiver (relative to 
the number at the eavesdropper), in the limit of many antennas. The capacity 
is zero for any point below the curve, i.e., whenever the eavesdropper has 
sufficiently many antennas. 

[Figure 2 plot. Horizontal axis: receive-to-transmit antenna ratio $n_r/n_t$.] 

Fig. 2. The minimum (relative) number of eavesdropper antennas required 
to drive the secrecy capacity to zero, as a function of the antenna allocation 
between transmitter and intended receiver, in the limit of many antennas. 
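The allocation result can be reproduced with a few lines of arithmetic. The sketch below, which uses hypothetical function names and only the Python standard library, minimizes $\beta + \gamma$ along the boundary curve $\gamma = (1 - \sqrt{2\beta})^2$ of region (32) by grid search, and evaluates the number of eavesdropper antennas (relative to $T = n_t + n_r$) needed for a given ratio $n_r/n_t$. The closed form in `eavesdropper_antennas_needed` is obtained by solving $\gamma = \rho\beta$ on the boundary, which gives $\sqrt{\beta} = 1/(\sqrt{\rho} + \sqrt{2})$.

```python
import math

def boundary_gamma(beta):
    """Boundary of the zero-capacity region (32)."""
    return (1.0 - math.sqrt(2.0 * beta)) ** 2

def capacity_vanishes(beta, gamma):
    """True if (beta, gamma) lies in the zero-capacity region (32)."""
    return 0.0 <= beta <= 0.5 and gamma <= boundary_gamma(beta)

def eavesdropper_antennas_needed(ratio):
    """n_e / T on the boundary of (32) when n_r / n_t = ratio > 0."""
    return (math.sqrt(ratio) + math.sqrt(2.0)) ** 2 / (1.0 + ratio)

# Grid search for the allocation minimizing beta + gamma on the boundary.
betas = [0.5 * i / 100000 for i in range(100001)]
best_beta = min(betas, key=lambda b: b + boundary_gamma(b))
best_gamma = boundary_gamma(best_beta)

optimal_requirement = eavesdropper_antennas_needed(0.5)   # n_r / n_t = 1/2
equal_split_requirement = eavesdropper_antennas_needed(1.0)
```

The small difference between the two requirements illustrates how flat the objective is around the optimum.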

V. MIMOME Secrecy Capacity Analysis 

In this section we prove Theorem 1. Our proof involves two 
main parts. We first recognize the right-hand side of (10) as an 
upper bound on the secrecy capacity, then exploit properties 
of the saddle point solution to establish 

$$R_+(\bar{K}_P, \bar{K}_\Phi) = R_-(\bar{K}_P), \tag{34}$$

where $R_-(\bar{K}_P)$ is the lower bound (achievable rate) given in 
(15). 

[Figure 3 diagram. Boxes: saddle point $(\bar{K}_P, \bar{K}_\Phi)$; $\bar{K}_P \in \arg\max R_+(K_P, \bar{K}_\Phi)$; $\bar{K}_P \in \arg\max h(y_r - \bar{\Theta} y_e)$; $R_+(\bar{K}_P, \bar{K}_\Phi) = R_-(\bar{K}_P)$.] 

Fig. 3. Key steps in the proof of Theorem 1. First, the existence of a saddle 
point $(\bar{K}_P, \bar{K}_\Phi)$ is established, then the KKT conditions associated with the 
minimax expressions are used to simplify the saddle value to show that it 
matches the lower bound. 

We begin by stating our upper bound, which is a trivial 
generalization of that established in [4]. 

Lemma 1 ([4]): An upper bound on the secrecy capacity 
of the MIMOME channel (1) is given by 

$$C(P) \le R_+(\bar{K}_P, \bar{K}_\Phi) = \min_{K_\Phi \in \mathcal{K}_\Phi} \; \max_{K_P \in \mathcal{K}_P} R_+(K_P, K_\Phi), \tag{35}$$

where 

$$R_+(K_P, K_\Phi) \triangleq I(x; y_r \mid y_e), \tag{36}$$

with $x \sim \mathcal{CN}(0, K_P)$, and $z = [z_r^T \ \ z_e^T]^T \sim \mathcal{CN}(0, K_\Phi)$, and the domain 
sets $\mathcal{K}_P$ and $\mathcal{K}_\Phi$ are defined via (12) and (14), respectively. 

It remains to establish that this upper bound expression 
satisfies (34), which we do in the remainder of this section. 
We divide the proof into several steps, as depicted in Fig. 3. 

Furthermore, we remark in advance that the analysis 
throughout is slightly simpler when $\bar{K}_\Phi \succ 0$. Accordingly, in 
the following sections we focus on this nonsingular case and 
defer analysis for the singular case to appendices as it arises 
in our development. The key to analysis of the singular case 
is replacing the observations $y_r$ with reduced but equivalent 
observations. In particular, we will make use of the following 
claim, a proof of which is provided in Appendix I. 

Claim 2: Let the singular value decomposition of $\Phi$ be 
expressed in the form 

$$\Phi = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & \Delta \end{bmatrix} \begin{bmatrix} V_1 & V_2 \end{bmatrix}^{\dagger}, \qquad \sigma_{\max}(\Delta) < 1. \tag{37}$$

Then if $p_x$ is such that $I(x; y_r \mid y_e) < \infty$, we have 

$$I(x; y_r \mid y_e) = I(x; \tilde{y}_r \mid y_e), \tag{38}$$

where 

$$\tilde{y}_r \triangleq U_2^{\dagger} y_r = \tilde{H}_r x + \tilde{z}_r \tag{39}$$

with 

$$\tilde{H}_r \triangleq U_2^{\dagger} H_r \quad \text{and} \quad \tilde{z}_r \triangleq U_2^{\dagger} z_r. \tag{40}$$

Symmetrically, if $p_x$ is such that $I(x; y_e \mid y_r) < \infty$, we have 

$$I(x; y_e \mid y_r) = I(x; \tilde{y}_e \mid y_r), \tag{41}$$

where 

$$\tilde{y}_e \triangleq V_2^{\dagger} y_e = \tilde{H}_e x + \tilde{z}_e \tag{42}$$

with 

$$\tilde{H}_e \triangleq V_2^{\dagger} H_e \quad \text{and} \quad \tilde{z}_e \triangleq V_2^{\dagger} z_e. \tag{43}$$

Finally, for any $p_x$ we have that $I(x; y_r \mid y_e) = \infty$ if and only 
if 

$$T K_P T^{\dagger} \ne 0, \tag{44}$$

where $K_P$ is the covariance associated with $p_x$, and where 

$$T \triangleq U_1^{\dagger} H_r - V_1^{\dagger} H_e. \tag{45}$$

Note that when (38) holds, the equivalent model holds and 

$$\tilde{\Phi} = E[\tilde{z}_r z_e^{\dagger}] = U_2^{\dagger} \Phi \tag{46}$$

is the equivalent noise cross-covariance. 

A. Existence of a Saddle Point Solution 

We first show that the minimax upper bound is a convex- 
concave problem with a (finite) saddle point solution. 

Lemma 2: The upper bound (l35T l has a saddle point solu- 
tion, i.e., there exists (Kp,K$) € OCp x UC^, such that 

i?+(K P ,K # ) < i?+(Kp,K*) < i?+(K P ,K # ) (47) 

holds for each (Kp,K$) € 3Cp x 3C$. Moreover, the saddle 
value is finite, i.e., 

i?+(Kp,K*) < oo. (48) 

Proof: Since the constraint sets %p and 3C^. are convex 
and compact, from a special case of Sion's minimax theorem 
[28 1 it suffices to show that 

i?+(Kp, ■) is convex on for each Kp e %p (PI) 
-/?+(•, K#) is concave on "Kp for each 6 3C# (P2) 

To first establish (IPlb . we begin by writing 

/(x;y r |y e )=/(x;y r ,y e )-J(x;y e ), (49) 

and observe that the second term in d49l is fixed for each 
e 3C*. Thus it suffices to show that with x ~ CN(0, Kp), 
the first term in d49l is convex in K#. This is established in, 
e.g., [29, Lemma II-3, p. 3076]. 

We next establish (1P21 >. With slight abuse of notation, we 
define R + (p x ,K.^>) = /(x;y r |y c ) with x ~ p x and z ~ 
CN(0, K*). By contrast, our original notation R + (Q, K^) 
corresponds to the special case of R + (p x ,T£$>) in which 
Px = SK(0,Q). Let p° = KNT(0, Q ), pj = KN(0,Qi), 



p» = 6»pi + (1 - 6»)p°, and Q e = (1 - 0)Q O + 0Q l5 for 
some 9 e [0, 1]. Then the required concavity follows from 

R + (Q e ,K*) = i? + (e^(0,Q e ),K*) 

>i?+( Px 9 ,K*) (50) 
> (l-9)R + ( P ° x ,Ks,) + 6R + ( P lK*) (51) 
= (1 - 0)ii+(Qo, K*) + ftR+(Qi,K»), 

where ( T50b follows from the fact that a Gaussian distribution 
maximizes i? + (p x ,K^,) among all distributions with a given 
covariance, which we discuss below, and where ( Bil l follows 
from the fact that 7(x;y r |y c ) is concave in p x for each fixed 
Py r ,y e |x; see, e.g., ||8] Appendix I]. 

Verifying (50) is straightforward when $K_\Phi$ is nonsingular, i.e., $\|\Phi\|_2 < 1$. Specifically, with

$$\Lambda(K_P, K_\Phi) \triangleq I + H_r K_P H_r^\dagger - (\Phi + H_r K_P H_e^\dagger)(I + H_e K_P H_e^\dagger)^{-1}(\Phi^\dagger + H_e K_P H_r^\dagger) \tag{52}$$

denoting the error covariance associated with the linear MMSE estimate $\Theta(K_P, K_\Phi) y_e$ of $y_r$ from $y_e$, a simple generalization of [4, Lemma 2] yields

$$I(x; y_r | y_e) = h(y_r | y_e) - h(z_r | z_e) \tag{53}$$

$$= h(y_r | y_e) - \log\det \pi e (I - \Phi\Phi^\dagger)$$

$$\le \log\det \Lambda(K_P, K_\Phi) - \log\det(I - \Phi\Phi^\dagger), \tag{54}$$

where the last inequality is satisfied with equality if $p_x = \mathcal{CN}(0, K_P)$. When $K_\Phi$ is singular, (53) is not well-defined, so some straightforward modifications to the approach are required; these we detail in Appendix II.

Finally, to verify (48), it suffices to note that

$$R_+(\bar{K}_P, \bar{K}_\Phi) \le R_+(\bar{K}_P, I) \le I(x; y_e, y_r) < \infty,$$

where the second inequality follows from the chain rule $I(x; y_r | y_e) = I(x; y_e, y_r) - I(x; y_e)$, and where the last inequality follows from the fact that $\mathrm{cov}(z) = I$. ■
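The two expressions for $I(x; y_r | y_e)$ used in this proof, the MMSE form of (53)-(54) with a Gaussian input and the chain-rule form of (49), can be checked against each other numerically. The following is a minimal sketch for the scalar (single-antenna) case only; the function names and the sample values are illustrative assumptions and do not appear in the paper.

```python
import math

def conditional_mi_via_mmse(h_r, h_e, p, phi):
    """I(x; y_r | y_e) in nats for scalar channels and x ~ CN(0, p),
    computed as log(Lambda) - log(1 - |phi|^2), cf. (52)-(54)."""
    cross = phi + h_r * p * h_e.conjugate()      # cov(y_r, y_e)
    var_e = 1 + abs(h_e) ** 2 * p                # var(y_e)
    lam = 1 + abs(h_r) ** 2 * p - abs(cross) ** 2 / var_e  # MMSE error variance
    return math.log(lam) - math.log(1 - abs(phi) ** 2)

def conditional_mi_via_chain_rule(h_r, h_e, p, phi):
    """Same quantity as I(x; y_r, y_e) - I(x; y_e), cf. (49),
    using the determinant of the 2x2 covariance of (y_r, y_e)."""
    a = 1 + abs(h_r) ** 2 * p
    d = 1 + abs(h_e) ** 2 * p
    b = phi + h_r * p * h_e.conjugate()
    det_y = a * d - abs(b) ** 2                  # det(K_Phi + p h h^dagger)
    det_z = 1 - abs(phi) ** 2                    # det(K_Phi)
    return math.log(det_y / det_z) - math.log(d)

# Illustrative values (assumptions): complex gains, power 4, |phi| < 1.
h_r, h_e, p, phi = 1.2 + 0.5j, 0.7 - 0.3j, 4.0, 0.3 + 0.2j
mmse_value = conditional_mi_via_mmse(h_r, h_e, p, phi)
chain_value = conditional_mi_via_chain_rule(h_r, h_e, p, phi)
```

Both routes agree because $\det(K_\Phi + p\,hh^\dagger)$ factors as the variance of $y_e$ times the MMSE error variance of $y_r$ given $y_e$.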

B. Property of the Saddle Point 

To simplify evaluation of the associated saddle value, we now develop Property 1. For notational convenience, we define $\bar{\Lambda}$ via [cf. (52)]

$$\bar{\Lambda} \triangleq \Lambda(\bar{K}_P), \qquad \Lambda(K_P) \triangleq \Lambda(K_P, \bar{K}_\Phi). \tag{55}$$

The required property is obtained by combining the follow- 
ing two lemmas. 

Lemma 3: A saddle point solution $(\bar{K}_P, \bar{K}_\Phi)$ to (35) satisfies

$$(H_r - \bar{\Theta} H_e) \bar{K}_P (\bar{\Phi}^\dagger H_r - H_e)^\dagger = 0. \tag{56}$$

Lemma 4: A saddle point solution $(\bar{K}_P, \bar{K}_\Phi)$ to (35) is such that

$$(H_r - \bar{\Theta} H_e) S \text{ has full column-rank} \tag{57}$$

provided $H_r \ne \bar{\Theta} H_e$, where $S$ is a full column-rank matrix such that $S S^\dagger = \bar{K}_P$.

In particular, combining (56) and (57) we immediately obtain (22), since for a full column-rank matrix $M$, $Ma = 0$ if and only if $a = 0$.

In the remainder of the section, we prove the two lemmas. 
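Proof of Lemma 3: Here we consider the simpler case when $\bar{K}_\Phi \succ 0$; the extension of the proof to the case when $\bar{K}_\Phi$ is singular is provided in Appendix III.

We begin by noting that the second inequality in (47) implies

$$\bar{K}_\Phi \in \arg\min_{K_\Phi \in \mathcal{K}_\Phi} R_+(\bar{K}_P, K_\Phi). \tag{58}$$

The Lagrangian associated with the minimization (58) is

$$\mathcal{L}_\Phi(K_\Phi, \Upsilon) = R_+(\bar{K}_P, K_\Phi) + \mathrm{tr}(\Upsilon K_\Phi), \tag{59}$$

where the dual variable

$$\Upsilon = \begin{bmatrix} \Upsilon_1 & 0 \\ 0 & \Upsilon_2 \end{bmatrix} \tag{60}$$

is a block diagonal matrix corresponding to the constraint that the noise covariance $K_\Phi$ must have identity matrices on its diagonal. The associated KKT conditions yield

$$\nabla_{K_\Phi} \mathcal{L}_\Phi(K_\Phi, \Upsilon)\big|_{K_\Phi = \bar{K}_\Phi} = \nabla_{K_\Phi} R_+(\bar{K}_P, K_\Phi)\big|_{K_\Phi = \bar{K}_\Phi} + \Upsilon = 0. \tag{61}$$

Substituting

$$\nabla_{K_\Phi} R_+(\bar{K}_P, K_\Phi)\big|_{K_\Phi = \bar{K}_\Phi} = \nabla_{K_\Phi} \left[\log\det(K_\Phi + H_t \bar{K}_P H_t^\dagger) - \log\det(K_\Phi)\right]\big|_{K_\Phi = \bar{K}_\Phi} = (\bar{K}_\Phi + H_t \bar{K}_P H_t^\dagger)^{-1} - \bar{K}_\Phi^{-1}, \tag{62}$$

where $H_t$ denotes the combined channel matrix obtained by stacking $H_r$ on top of $H_e$, into (61) and simplifying, we obtain

$$H_t \bar{K}_P H_t^\dagger = \bar{K}_\Phi \Upsilon (\bar{K}_\Phi + H_t \bar{K}_P H_t^\dagger). \tag{63}$$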

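To complete the proof requires a straightforward manipulation of (63) to obtain (56). Specifically, substituting for $\bar{K}_\Phi$ and $H_t$ from their definitions into (63), and carrying out the associated block matrix multiplication, yields

$$H_r \bar{K}_P H_r^\dagger = \Upsilon_1 (I + H_r \bar{K}_P H_r^\dagger) + \bar{\Phi} \Upsilon_2 (\bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger) \tag{64}$$

$$H_r \bar{K}_P H_e^\dagger = \Upsilon_1 (\bar{\Phi} + H_r \bar{K}_P H_e^\dagger) + \bar{\Phi} \Upsilon_2 (I + H_e \bar{K}_P H_e^\dagger) \tag{65}$$

$$H_e \bar{K}_P H_r^\dagger = \bar{\Phi}^\dagger \Upsilon_1 (I + H_r \bar{K}_P H_r^\dagger) + \Upsilon_2 (\bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger) \tag{66}$$

$$H_e \bar{K}_P H_e^\dagger = \bar{\Phi}^\dagger \Upsilon_1 (\bar{\Phi} + H_r \bar{K}_P H_e^\dagger) + \Upsilon_2 (I + H_e \bar{K}_P H_e^\dagger). \tag{67}$$

Eliminating $\Upsilon_1$ from (64) and (66), we obtain

$$(\bar{\Phi}^\dagger H_r - H_e) \bar{K}_P H_r^\dagger = (\bar{\Phi}^\dagger \bar{\Phi} - I) \Upsilon_2 (\bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger), \tag{68}$$

and eliminating $\Upsilon_1$ from (65) and (67), we obtain

$$(\bar{\Phi}^\dagger H_r - H_e) \bar{K}_P H_e^\dagger = (\bar{\Phi}^\dagger \bar{\Phi} - I) \Upsilon_2 (I + H_e \bar{K}_P H_e^\dagger). \tag{69}$$

Finally, eliminating $\Upsilon_2$ from (68) and (69), we obtain

$$(\bar{\Phi}^\dagger H_r - H_e) \bar{K}_P H_r^\dagger = (\bar{\Phi}^\dagger H_r - H_e) \bar{K}_P H_e^\dagger (I + H_e \bar{K}_P H_e^\dagger)^{-1} (\bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger) = (\bar{\Phi}^\dagger H_r - H_e) \bar{K}_P H_e^\dagger \bar{\Theta}^\dagger, \tag{70}$$

which reduces to (56) as desired. ■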
In preparation for proving Lemma 4, we establish the following key proposition, whose proof is provided in Appendix IV.

Proposition 1: When $x \sim \mathcal{CN}(0, K_P)$ and $z \sim \mathcal{CN}(0, \bar{K}_\Phi)$ with $\bar{K}_\Phi \succ 0$ in the model (1), we have$^5$

$$\arg\max_{K_P \in \mathcal{K}_P} h(y_r - \Theta(K_P) y_e) = \arg\max_{K_P \in \mathcal{K}_P} h(y_r - \bar{\Theta} y_e), \tag{71}$$

where $\bar{\Theta}$ and $\Theta(K_P)$ are as defined in (17) with (18).

Proof of Lemma 4: Again, here we consider the simpler case when $\bar{K}_\Phi$ is nonsingular; a proof for the case when $\bar{K}_\Phi$ is singular is provided in Appendix V.
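We begin by noting that

$$\bar{K}_P \in \arg\max_{K_P \in \mathcal{K}_P} R_+(K_P, \bar{K}_\Phi) \tag{72}$$

$$= \arg\max_{K_P \in \mathcal{K}_P} h(y_r | y_e) \tag{73}$$

$$= \arg\max_{K_P \in \mathcal{K}_P} h(y_r - \Theta(K_P) y_e) \tag{74}$$

$$= \arg\max_{K_P \in \mathcal{K}_P} h(y_r - \bar{\Theta} y_e) \tag{75}$$

$$= \arg\max_{K_P \in \mathcal{K}_P} \log\det(I + H_{\mathrm{eff}} K_P H_{\mathrm{eff}}^\dagger), \tag{76}$$

where (72) follows from the first inequality in (47), where (73) follows from the fact that $\bar{K}_\Phi \succ 0$, where (75) follows from Proposition 1, and where in (76) we have the effective channel

$$H_{\mathrm{eff}} \triangleq J^{-1/2} (H_r - \bar{\Theta} H_e), \tag{77a}$$

with

$$J \triangleq I + \bar{\Theta}\bar{\Theta}^\dagger - \bar{\Theta}\bar{\Phi}^\dagger - \bar{\Phi}\bar{\Theta}^\dagger = (I - \bar{\Phi}\bar{\Phi}^\dagger) + (\bar{\Theta} - \bar{\Phi})(\bar{\Theta} - \bar{\Phi})^\dagger, \tag{77b}$$

which is nonsingular since $\bar{K}_\Phi \succ 0$.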

Finally, because $J \succ 0$, showing (57) is equivalent to showing that $H_{\mathrm{eff}} S$ has full column-rank, which we establish in the sequel to conclude the proof. First, we express $H_{\mathrm{eff}}$ in terms of its singular value decomposition

$$H_{\mathrm{eff}} = A \Sigma_{\mathrm{eff}} B^\dagger, \tag{78}$$

i.e., $A$ and $B$ are unitary matrices, and

$$\Sigma_{\mathrm{eff}} = \begin{bmatrix} \Sigma_0 & 0 \\ 0 & 0 \end{bmatrix}, \tag{79}$$

where the leading block $\Sigma_0$ is $\nu \times \nu$ with $\nu = \mathrm{rank}(H_{\mathrm{eff}}) > 0$, and $\Sigma_0$ is diagonal with strictly positive entries. We establish that $H_{\mathrm{eff}} S$ has full column-rank by showing that the columns of $S$ are spanned by the first $\nu$ columns of $B$, i.e.,

$$\bar{F} \triangleq B^\dagger \bar{K}_P B = \begin{bmatrix} \bar{F}_0 & 0 \\ 0 & 0 \end{bmatrix} \tag{80}$$

for some $\bar{F}_0 \succeq 0$, where the block rows and block columns have sizes $\nu$ and $n_t - \nu$.

5 Note that the maximum on the left-hand side is in general a lower bound on the maximum on the right-hand side.

6 As an aside, note that (76) provides the interpretation of $\bar{K}_P$ as an optimal input covariance for a MIMO channel with matrix $H_{\mathrm{eff}}$ and unit-variance white Gaussian noise.

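To this end, substituting (78) into (76), we obtain

$$\bar{K}_P \in \arg\max_{K_P \in \mathcal{K}_P} \log\det(I + A \Sigma_{\mathrm{eff}} B^\dagger K_P B \Sigma_{\mathrm{eff}}^\dagger A^\dagger) = \arg\max_{K_P \in \mathcal{K}_P} \log\det(I + \Sigma_{\mathrm{eff}} B^\dagger K_P B \Sigma_{\mathrm{eff}}^\dagger). \tag{81}$$

Now $K_P \in \mathcal{K}_P$ if and only if $F \triangleq B^\dagger K_P B \in \mathcal{K}_P$, so (81) implies that

$$\bar{F} \in \arg\max_{F \in \mathcal{K}_P} \log\det(I + \Sigma_{\mathrm{eff}} F \Sigma_{\mathrm{eff}}^\dagger) = \arg\max_{F \in \mathcal{K}_P} \log\det(I + \Sigma_0 F_0 \Sigma_0^\dagger), \tag{82}$$

with $F$ expressed in terms of the block notation

$$F = \begin{bmatrix} F_0 & F_1 \\ F_1^\dagger & F_2 \end{bmatrix}, \tag{83}$$

where the block rows and block columns have sizes $\nu$ and $n_t - \nu$, and where (82) follows from (79).

Finally, it follows that $\bar{F}_1$ and $\bar{F}_2$, the $F_1$ and $F_2$ in (83) when $F = \bar{F}$, are both 0. Indeed, if $\bar{F}_2 \ne 0$, then $\mathrm{tr}(\bar{F}_2) > 0$. This would contradict the optimality in (82): since the objective function only depends on $F_0$, one could strictly increase the objective function by increasing the trace of $F_0$ and decreasing the trace of $F_2$. Finally, since $\bar{F} \succeq 0$ and $\bar{F}_2 = 0$, it follows that $\bar{F}_1 = 0$. ■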
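The trace argument above is the familiar water-filling observation: power placed on a direction with zero effective gain cannot raise the objective in (82), so an optimizer leaves it empty. A minimal sketch follows, under the simplifying assumption that both $\Sigma_0$ and the optimizing $F_0$ are diagonal, so the problem reduces to scalar power allocation over squared gains; the function name and sample numbers are hypothetical.

```python
def water_fill(gains, total_power, iters=100):
    """Maximize sum(log(1 + g_i * p_i)) subject to p_i >= 0 and
    sum(p_i) <= total_power. `gains` are squared singular values;
    zero entries model directions outside the row space of H_eff."""
    active = [g for g in gains if g > 0]
    if not active:
        return [0.0] * len(gains)
    # Bisection on the water level mu, where p_i = max(mu - 1/g_i, 0).
    lo, hi = 0.0, total_power + max(1.0 / g for g in active)
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        used = sum(max(mu - 1.0 / g, 0.0) for g in active)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    return [max(lo - 1.0 / g, 0.0) if g > 0 else 0.0 for g in gains]

# Two useful directions and two null directions (illustrative values).
allocation = water_fill([4.0, 1.0, 0.0, 0.0], total_power=3.0)
```

In this example the whole budget is split between the first two directions, mirroring $\bar{F}_2 = 0$ in the proof.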



C. Evaluation of the Saddle Value: Proof of Theorem 1

The conditions in Lemmas 3 and 4 can be used in turn to establish the tightness of the upper bound (35).

Lemma 5: The saddle value $R_+(\bar{K}_P, \bar{K}_\Phi)$ in (35) can be expressed as

$$R_+(\bar{K}_P, \bar{K}_\Phi) = \begin{cases} R_-(\bar{K}_P), & H_r \ne \bar{\Theta} H_e \\ 0, & \text{otherwise}, \end{cases} \tag{84}$$

where $R_-(K_P)$ is as given in (13).

The proof of Theorem 1 is a direct consequence of Lemma 5. If $R_+(\bar{K}_P, \bar{K}_\Phi) = 0$, the capacity is zero; otherwise $R_+(\bar{K}_P, \bar{K}_\Phi) = R_-(\bar{K}_P)$, and the latter expression is an achievable rate, as can be seen by setting $p_u = p_x = \mathcal{CN}(0, \bar{K}_P)$ in the argument of (19).

Thus, to conclude the section it remains only to prove our lemma.

Proof of Lemma 5: Here we consider the case when $\bar{K}_\Phi \succ 0$, i.e., $\|\bar{\Phi}\|_2 < 1$; the proof for the case when $\bar{K}_\Phi$ is singular is provided in Appendix VI.



To obtain (84) when $H_r \ne \bar{\Theta} H_e$, we begin by writing the gap between upper and lower bounds as

$$R_+(\bar{K}_P, \bar{K}_\Phi) - R_-(\bar{K}_P) = I(x; y_r | y_e) - [I(x; y_r) - I(x; y_e)]$$

$$= I(x; y_e | y_r) \tag{85}$$

$$= h(y_e | y_r) - h(z_e | z_r),$$

then note that this gap is zero since

$$h(y_e | y_r) = \log\det \pi e \Lambda_b \tag{86}$$

$$= \log\det \pi e (I + H_e \bar{K}_P H_e^\dagger - \bar{\Phi}^\dagger (I + H_r \bar{K}_P H_r^\dagger) \bar{\Phi}) \tag{87}$$

$$= \log\det \pi e (I - \bar{\Phi}^\dagger \bar{\Phi}) \tag{88}$$

$$= h(z_e | z_r),$$

where in (86)

$$\Lambda_b \triangleq I + H_e \bar{K}_P H_e^\dagger - (\bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger)(I + H_r \bar{K}_P H_r^\dagger)^{-1}(\bar{\Phi} + H_r \bar{K}_P H_e^\dagger) \tag{89}$$

is the "backward" error covariance associated with the linear MMSE estimate of $y_e$ from $y_r$, and where to obtain each of (87) and (88) we have used (22) of Property 1.

To obtain (84) when $H_r = \bar{\Theta} H_e$, we note that

$$R_+(\bar{K}_P, \bar{K}_\Phi) = I(x; y_r | y_e) \tag{90}$$

$$= h(y_r | y_e) - h(z_r | z_e)$$

$$= h(y_r - \bar{\Theta} y_e) - h(z_r - \bar{\Phi} z_e) \tag{91}$$

$$= h(z_r - \bar{\Theta} z_e) - h(z_r - \bar{\Phi} z_e) \tag{92}$$

$$= 0, \tag{93}$$

where (91) follows from the fact that $\bar{\Theta}$ in (17) is the coefficient in the MMSE estimate of $y_r$ from $y_e$, and $\bar{\Phi}$ is the coefficient in the MMSE estimate of $z_r$ from $z_e$, where (92) follows via the relation $H_r = \bar{\Theta} H_e$, so that $y_r - \bar{\Theta} y_e = z_r - \bar{\Theta} z_e$, and where (93) follows from (23).

■
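The vanishing of the gap $I(x; y_e | y_r)$ in (85)-(88) can be illustrated in the scalar case, where the saddle-point condition used in the proof reduces to $h_e = \bar{\phi}^* h_r$. The sketch below is an illustration under that scalar assumption, not part of the original proof; the function name and the numerical values are hypothetical.

```python
import math

def backward_gap(h_r, h_e, p, phi):
    """I(x; y_e | y_r) in nats for scalar channels and x ~ CN(0, p):
    log of the backward MMSE error variance, cf. (89), minus
    log(1 - |phi|^2), which is the noise term h(z_e | z_r) up to
    the common pi*e factor that cancels."""
    var_r = 1 + abs(h_r) ** 2 * p
    cross = phi.conjugate() + h_e * p * h_r.conjugate()   # cov(y_e, y_r)
    lam_b = 1 + abs(h_e) ** 2 * p - abs(cross) ** 2 / var_r
    return math.log(lam_b) - math.log(1 - abs(phi) ** 2)

phi = 0.4 + 0.3j                  # noise cross-correlation, |phi| = 0.5
h_r = 1.5 - 0.5j
matched_h_e = phi.conjugate() * h_r   # scalar form of the saddle-point condition
gap_matched = backward_gap(h_r, matched_h_e, 2.0, phi)
gap_generic = backward_gap(h_r, 0.2 + 0.1j, 2.0, phi)
```

With the matched eavesdropper gain the gap is zero up to rounding, whereas a generic gain leaves a strictly positive gap between the upper and lower bounds.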

VI. Capacity Analysis in the High-SNR Regime 

We begin with a convenient upper bound that is used in our converse argument, then exploit the GSVD in developing the coding scheme for our achievability argument. Our high-SNR capacity results follow, and separately consider the cases where $H_e$ does and does not have full column-rank.

Lemma 6: For all choices of $\Theta \in \mathbb{C}^{n_r \times n_e}$ and $\Phi \in \mathbb{C}^{n_r \times n_e}$ such that $\|\Phi\|_2 < 1$, the secrecy capacity (35) of the channel (1) is upper bounded by

$$C(P) \le \max_{K_P \in \mathcal{K}_P} R_{++}(K_P, \Theta, \Phi), \tag{94a}$$

where

$$R_{++}(K_P, \Theta, \Phi) \triangleq h(y_r - \Theta y_e) - \log\det \pi e (I - \Phi\Phi^\dagger) = \log \frac{\det(H K_P H^\dagger + I + \Theta\Theta^\dagger - \Theta\Phi^\dagger - \Phi\Theta^\dagger)}{\det(I - \Phi\Phi^\dagger)} \tag{94b}$$

with

$$H = H_r - \Theta H_e. \tag{94c}$$



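Proof: First note that the objective function $R_+(K_P, K_\Phi)$ in (11) can be expressed in the form

$$R_+(K_P, K_\Phi) = I(x; y_r | y_e)$$

$$= h(y_r | y_e) - h(z_r | z_e)$$

$$= h(y_r | y_e) - \log\det \pi e (I - \Phi\Phi^\dagger)$$

$$= \min_{\Theta} h(y_r - \Theta y_e) - \log\det \pi e (I - \Phi\Phi^\dagger)$$

$$= \min_{\Theta} R_{++}(K_P, \Theta, \Phi). \tag{95}$$

Hence,

$$C(P) \le \min_{K_\Phi \in \mathcal{K}_\Phi} \max_{K_P \in \mathcal{K}_P} R_+(K_P, K_\Phi) \tag{96}$$

$$= \min_{K_\Phi \in \mathcal{K}_\Phi} \max_{K_P \in \mathcal{K}_P} \min_{\Theta} R_{++}(K_P, \Theta, \Phi) \tag{97}$$

$$\le \min_{K_\Phi \in \mathcal{K}_\Phi} \min_{\Theta} \max_{K_P \in \mathcal{K}_P} R_{++}(K_P, \Theta, \Phi) \tag{98}$$

$$= \min_{\Phi : \|\Phi\|_2 \le 1} \min_{\Theta} \max_{K_P \in \mathcal{K}_P} R_{++}(K_P, \Theta, \Phi), \tag{99}$$

where to obtain (96) we have used (35), where to obtain (97) we have used (95), and where to obtain (98) we have used that a minimax quantity upper bounds a corresponding maximin quantity.

Finally, we further upper bound (99) by making arbitrary choices for $\Theta$ and $\Phi$, yielding (94). ■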
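The step from (97) to (98) uses only the general fact that a maximin never exceeds the corresponding minimax. A toy sketch on a finite payoff table makes the inequality concrete; the table is an arbitrary illustration in which rows stand in for choices of $K_P$ and columns for choices of $\Theta$, and has no connection to any particular channel.

```python
def maximin(payoff):
    """max over rows of (min over columns)."""
    return max(min(row) for row in payoff)

def minimax(payoff):
    """min over columns of (max over rows)."""
    n_cols = len(payoff[0])
    return min(max(row[j] for row in payoff) for j in range(n_cols))

# Arbitrary 2x2 table with no saddle point, so the inequality is strict.
payoff = [[3, 1],
          [0, 2]]
lower = maximin(payoff)   # row minima are 1 and 0, so the maximin is 1
upper = minimax(payoff)   # column maxima are 3 and 2, so the minimax is 2
```

When a saddle point exists, as Lemma 2 guarantees for $R_+$, the two quantities coincide; for $R_{++}$ only the inequality is needed.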

A. GSVD Properties 

The following properties of the GSVD in Definition 1 are useful in our analysis.

First, the GSVD simultaneously diagonalizes the channels in our model (1). In particular, applying (6) we obtain



y r (t) = £ r x(t)+z r (i) 
%{t) = Eex(t) + z e (i), 



(100) 



where 



5L = 



k—p- 



k—p—s 


s 


p 







D r 













I 






fc- 


-p— s 


s 


p 






I 















D c 






and 



x(t) 

*.(*) 

z r (t) 

2e(*) 



r*Jy»(t)l 

[*&.(*)] l!lHp 

[*J z r(0], ir _ p _ s+1 . Ilr 



The corresponding equivalent channel is as depicted in Fig. 4.

Fig. 4. Equivalent parallel channel model obtained via GSVD.



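Second, the GSVD yields a characterization of the null space of $H_e$. In particular,

$$\mathrm{Null}(H_e) = \mathcal{S}_r \cup \mathcal{S}_n, \tag{101}$$

where, expressing $\Psi_t$ as defined in (6) in terms of its columns $\psi_i$, $i = 1, \ldots, n_t$, viz.,

$$\Psi_t = [\psi_1 \ \cdots \ \psi_{n_t}],$$

we have [cf. (2a), (2d)]

$$\mathcal{S}_r = \mathrm{span}(\psi_{k-p+1}, \ldots, \psi_k) \tag{102a}$$

$$\mathcal{S}_n = \mathrm{span}(\psi_{k+1}, \ldots, \psi_{n_t}). \tag{102b}$$

We first verify (102). To establish (102b), it suffices to note that

$$H_r \psi_j = H_e \psi_j = 0, \quad j = k + 1, \ldots, n_t,$$

which can be readily verified from (6).

To establish (102a), we show for all $j \in \{k - p + 1, \ldots, k\}$ that $H_e \psi_j = 0$ and that the $\{H_r \psi_j\}$ are linearly independent. It suffices to show that the last $p$ columns of $\Sigma_r \Omega^{-1}$ are linearly independent and the last $p$ columns of $\Sigma_e \Omega^{-1}$ are zero. To this end, note that since $\Omega^{-1}$ in (6) is a lower triangular matrix, it can be expressed in the form

$$\Omega^{-1} = \begin{bmatrix} \Omega_1^{-1} & 0 & 0 \\ T_{21} & \Omega_2^{-1} & 0 \\ T_{31} & T_{32} & \Omega_3^{-1} \end{bmatrix}, \tag{103}$$

where the block rows and block columns have sizes $k - p - s$, $s$, and $p$. By direct block left-multiplication of (103) with (7a) and (7b), we have

$$\Sigma_r \Omega^{-1} = \begin{bmatrix} 0 & 0 & 0 \\ D_r T_{21} & D_r \Omega_2^{-1} & 0 \\ T_{31} & T_{32} & \Omega_3^{-1} \end{bmatrix} \tag{104a}$$

$$\Sigma_e \Omega^{-1} = \begin{bmatrix} \Omega_1^{-1} & 0 & 0 \\ D_e T_{21} & D_e \Omega_2^{-1} & 0 \\ 0 & 0 & 0 \end{bmatrix}, \tag{104b}$$

where in both cases the block columns have widths $k - p - s$, $s$, and $p$.

Since $\Omega_3^{-1}$ is invertible (since $\Omega$ is nonsingular), the last $p$ columns of $\Sigma_r \Omega^{-1}$ are linearly independent and the last $p$ columns of $\Sigma_e \Omega^{-1}$ are zero, establishing (102a).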
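To characterize $\mathrm{Null}(H_e)$, we use (102a) and (102b) with (101) to obtain

$$\mathrm{Null}(H_e) = \mathrm{span}(\psi_{k-p+1}, \ldots, \psi_{n_t}),$$

from which we obtain that

$$H_e^\sharp \triangleq \Psi_{ne} \Psi_{ne}^\dagger \tag{105}$$

is the projection matrix onto $\mathrm{Null}(H_e)$, where

$$\Psi_{ne} \triangleq [\psi_{k-p+1} \ \cdots \ \psi_{n_t}]. \tag{106}$$

In turn, using (106) and (104a) in (6a) we obtain

$$H_r \Psi_{ne} = \Psi_r \begin{bmatrix} 0 & 0 \\ \Omega_3^{-1} & 0 \end{bmatrix},$$

where the block rows have heights $n_r - p$ and $p$ and the block columns have widths $p$ and $n_t - k$, whence

$$H_r H_e^\sharp H_r^\dagger = \Psi_r \begin{bmatrix} 0 & 0 \\ 0 & \Omega_3^{-1} \Omega_3^{-\dagger} \end{bmatrix} \Psi_r^\dagger, \tag{107}$$

where the block rows and block columns have sizes $n_r - p$ and $p$.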

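Third, the GSVD can be more simply described when the matrix $H_e$ has full column-rank. To see this, first note from (3) and (5) that

$$k = n_t \quad \text{and} \quad p = 0, \tag{108}$$

respectively, and thus (6) specializes to

$$\Psi_r^\dagger H_r \Psi_t \Omega = \Sigma_r, \qquad \Psi_e^\dagger H_e \Psi_t \Omega = \Sigma_e \tag{109a}$$

with [cf. (7)]

$$\Sigma_r = \begin{bmatrix} 0 & 0 \\ 0 & D_r \end{bmatrix}, \qquad \Sigma_e = \begin{bmatrix} I & 0 \\ 0 & D_e \\ 0 & 0 \end{bmatrix}, \tag{109b}$$

where the block columns have widths $n_t - s$ and $s$, the block rows of $\Sigma_r$ have heights $n_r - s$ and $s$, and the block rows of $\Sigma_e$ have heights $n_t - s$, $s$, and $n_e - n_t$, and with $D_r$ and $D_e$ as in Definition 1. Hence, it follows from (109b) that

$$H_e^\ddagger \triangleq \Psi_t \Omega \begin{bmatrix} I & 0 & 0 \\ 0 & D_e^{-1} & 0 \end{bmatrix} \Psi_e^\dagger, \tag{110}$$

where the block columns have widths $n_t - s$, $s$, and $n_e - n_t$, satisfies $H_e^\ddagger H_e = I$ and thus is the Moore-Penrose pseudo-inverse of $H_e$. Finally, from (109) and (110) we obtain

$$H_r H_e^\ddagger = \Psi_r \begin{bmatrix} 0 & 0 & 0 \\ 0 & D_r D_e^{-1} & 0 \end{bmatrix} \Psi_e^\dagger,$$

from which we see that the generalized singular values of $(H_r, H_e)$ in (9) are also the (ordinary) singular values of $H_r H_e^\ddagger$.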

We now turn to our secrecy capacity analysis in the high- 
SNR regime. There are two cases, which we consider sepa- 
rately. 



B. Case I: $\mathrm{rank}(H_e) = n_t$

In this case, we use that (108) holds and so the GSVD is given by (109), and thus $\dim \mathcal{S}_{r,e} = s$, $\dim \mathcal{S}_e = n_t - s$, and $\dim \mathcal{S}_r = \dim \mathcal{S}_n = 0$.

Achievability: In the equivalent parallel channel model of Fig. 4 there are $s$ subchannels that go to the intended receiver (and also to the eavesdropper, with different gains), which correspond to $\mathcal{S}_{r,e}$. Of these $s$ subchannels, we use only the subset for which the gains to the intended receiver are stronger than those to the eavesdropper, and with these our communication scheme uses Gaussian wiretap codebooks.

In particular, we transmit

$$x = \Psi_t \Omega \begin{bmatrix} 0_{n_t - s} \\ u \end{bmatrix}, \qquad u = [0, \ldots, 0, u_\nu, u_{\nu+1}, \ldots, u_s]^T, \tag{111}$$

where $\nu$ is the smallest index $j$ such that $\sigma_j > 1$, and where the nonzero elements of $u$ are i.i.d. $\mathcal{CN}(0, \alpha P)$ with $\alpha = 1/(n_t \sigma_{\max}(\Omega))$ so that the transmitted power is at most $P$.

Using (111) and (109) in (1), the observations at the intended receiver and eavesdropper, respectively, take the form

$$y_r = \Psi_r \begin{bmatrix} 0_{n_r - s} \\ D_r u \end{bmatrix} + z_r, \qquad y_e = \Psi_e \begin{bmatrix} 0_{n_t - s} \\ D_e u \\ 0_{n_e - n_t} \end{bmatrix} + z_e.$$

In turn, via (19), the (secrecy) rate achievable with this system is

$$R = I(u; y_r) - I(u; y_e) = \sum_{j=\nu}^{s} \log \frac{1 + \alpha P r_j^2}{1 + \alpha P e_j^2} = \sum_{j : \sigma_j > 1} \log \sigma_j^2 - o(1),$$

where $r_j$ and $e_j$ denote the diagonal entries of $D_r$ and $D_e$, as required.
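The convergence of the per-subchannel wiretap rate to $\sum_{j:\sigma_j>1} \log \sigma_j^2$ can be checked numerically. The sketch below assumes that each used subchannel contributes $\log[(1 + \alpha P r_j^2)/(1 + \alpha P e_j^2)]$, with $r_j$ and $e_j$ standing for the subchannel gains to the receiver and eavesdropper and $\sigma_j = r_j/e_j$; the function names and the gain values are illustrative assumptions.

```python
import math

def parallel_secrecy_rate(r, e, scaled_power):
    """Sum of log((1 + a*P*r_j^2) / (1 + a*P*e_j^2)) in nats over the
    subchannels where the receiver gain exceeds the eavesdropper gain.
    `scaled_power` plays the role of alpha * P."""
    return sum(
        math.log((1 + scaled_power * rj ** 2) / (1 + scaled_power * ej ** 2))
        for rj, ej in zip(r, e)
        if rj > ej
    )

def high_snr_limit(r, e):
    """Sum of log(sigma_j^2) over subchannels with sigma_j = r_j/e_j > 1."""
    return sum(math.log((rj / ej) ** 2) for rj, ej in zip(r, e) if rj > ej)

# Illustrative gains: the first subchannel favors the eavesdropper and is unused.
r_gains = [0.6, 0.8, 0.9]
e_gains = [0.8, 0.6, 0.436]
limit = high_snr_limit(r_gains, e_gains)
rate_low = parallel_secrecy_rate(r_gains, e_gains, 10.0)
rate_high = parallel_secrecy_rate(r_gains, e_gains, 1e8)
```

The rate increases with power and approaches the limit from below, which is the $-o(1)$ behavior stated above.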

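Converse: It suffices to use Lemma 6 with the choices

$$\Theta = H_r H_e^\ddagger, \qquad \Phi = \Psi_r \begin{bmatrix} 0 & 0 & 0 \\ 0 & \Xi & 0 \end{bmatrix} \Psi_e^\dagger, \tag{112a}$$

where the block columns have widths $n_t - s$, $s$, and $n_e - n_t$, where

$$\Xi = \mathrm{diag}(\xi_1, \xi_2, \ldots, \xi_s), \qquad \xi_i = \min\left(\sigma_i, \frac{1}{\sigma_i}\right), \tag{112b}$$

and where $H_e^\ddagger$ is the pseudo-inverse defined in (110). With these choices of parameters, (94c) evaluates to $H = 0$, so we can ignore the maximization over $K_P$ in (94a). Simplifying (94) for our choice of parameters yields

$$R_{++} \le \log \frac{\det(I + (D_r D_e^{-1})^2 - 2 D_r D_e^{-1} \Xi)}{\det(I - \Xi^2)} = \sum_{j : \sigma_j > 1} \log \sigma_j^2, \tag{113}$$

which establishes our result.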

C. Case II: $\mathrm{rank}(H_e) < n_t$

In this case, we use the general form of the GSVD as given by (5), so now $\dim \mathcal{S}_r = p > 0$ and $\dim \mathcal{S}_{r,e} = s > 0$.

Achievability: In the equivalent parallel channel model of Fig. 4 there are $p$ subchannels that go only to the intended receiver, corresponding to $\mathcal{S}_r$, and $s$ subchannels that go to both the intended receiver and eavesdropper (with different gains), corresponding to $\mathcal{S}_{r,e}$. Our communication scheme uses both sets of subchannels independently with Gaussian (wiretap) codebooks.

In particular, we transmit

$$x = \Psi_t \begin{bmatrix} 0_{k-p-s} \\ \Omega_2 u \\ v \\ 0_{n_t - k} \end{bmatrix}, \tag{114}$$

where $v$ and $u$ are the length-$p$ and length-$s$ auxiliary random vectors associated with communication over $\mathcal{S}_r$ and $\mathcal{S}_{r,e}$, respectively. The elements of $v$ are i.i.d. $\mathcal{CN}(0, (P - \sqrt{P})/p)$, corresponding to allocating power $P - \sqrt{P}$ to $\mathcal{S}_r$. For $\mathcal{S}_{r,e}$, we use only the subset of channels for which the gains to the intended receiver are stronger than those to the eavesdropper, so $u = [0, \ldots, 0, u_\nu, \ldots, u_s]^T$, where $\nu$ is the smallest index $j$ such that $\sigma_j > 1$, and where the nonzero elements are i.i.d. $\mathcal{CN}(0, \alpha\sqrt{P})$, independent of $v$, with $\alpha = 1/(n_t \sigma_{\max}(\Omega_2))$ so that the power allocated to $\mathcal{S}_{r,e}$ is at most $\sqrt{P}$.

With $x$ as in (114), the observations at the intended receiver and eavesdropper, respectively, take the form

$$y_r = \Psi_r \begin{bmatrix} 0_{n_r - p - s} \\ D_r u \\ T_{32} \Omega_2 u + \Omega_3^{-1} v \end{bmatrix} + z_r \tag{115a}$$

$$y_e = \Psi_e \begin{bmatrix} 0_{k-p-s} \\ D_e u \\ 0_{n_e + p - k} \end{bmatrix} + z_e. \tag{115b}$$

Via (19), the system (115) achieves (secrecy) rate

$$R = I(u, v; y_r) - I(u, v; y_e) = I(u; y_r) - I(u; y_e) + I(v; y_r | u), \tag{116}$$

where (116) follows from the fact that $v$ is independent of $(y_e, u)$, as (115b) reflects.

Evaluating the terms in (116), we obtain

$$I(u; y_r) - I(u; y_e) \ge \sum_{j=\nu}^{s} \log \frac{1 + \alpha\sqrt{P}\, r_j^2}{1 + \alpha\sqrt{P}\, e_j^2} = \sum_{j : \sigma_j > 1} \log \sigma_j^2 - o(1), \tag{117}$$

and

$$I(v; y_r | u) = \log\det\left(I + \frac{P - \sqrt{P}}{p} \Omega_3^{-1} \Omega_3^{-\dagger}\right) = \log\det\left(I + \frac{P}{p} \Omega_3^{-1} \Omega_3^{-\dagger}\right) - o(1) \tag{118}$$

$$= \log\det\left(I + \frac{P}{p} H_r H_e^\sharp H_r^\dagger\right) - o(1), \tag{119}$$

where (118) follows from the continuity of $\log\det(\cdot)$, and where (119) follows from (107). Substituting (117) and (119) into (116) yields our desired result. ■



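Converse: To establish the converse, we use Lemma 6 with the choices

$$\Theta = \Psi_r \begin{bmatrix} 0 & 0 & 0 \\ 0 & D_r D_e^{-1} & 0 \\ F_{31} & F_{32} & 0 \end{bmatrix} \Psi_e^\dagger \tag{120}$$

and

$$\Phi = \Psi_r \begin{bmatrix} 0 & 0 & 0 \\ 0 & \Xi & 0 \\ 0 & 0 & 0 \end{bmatrix} \Psi_e^\dagger, \tag{121}$$

where in both cases the block rows have heights $n_r - s - p$, $s$, and $p$ and the block columns have widths $k - s - p$, $s$, and $n_e + p - k$, where $\Xi$ is as defined in (112b), and where we choose

$$F_{32} = T_{32} \Omega_2 D_e^{-1}$$

$$F_{31} = (T_{31} - F_{32} D_e T_{21}) \Omega_1$$

with $T_{21}$, $T_{31}$ and $T_{32}$ as defined in (103), so that

$$H_r - \Theta H_e = \Psi_r \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \Omega_3^{-1} & 0 \end{bmatrix} \Psi_t^\dagger,$$

where the block columns have widths $k - p - s$, $s$, $p$, and $n_t - k$.

The upper bound expression (94) can now be simplified as follows. With $H = H_r - \Theta H_e$ as in (94c),

$$H K_P H^\dagger = (H_r - \Theta H_e) K_P (H_r - \Theta H_e)^\dagger = \Psi_r \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \Omega_3^{-1} Q \Omega_3^{-\dagger} \end{bmatrix} \Psi_r^\dagger, \tag{122}$$

where $Q$ is the central $p \times p$ diagonal block of $\Psi_t^\dagger K_P \Psi_t$ when the latter is partitioned into block rows and columns of sizes $k - p$, $p$, and $n_t - k$, and satisfies $\mathrm{tr}(Q) \le P$. From (122), (121) and (120), we have that the numerator in the right-hand side of (94b) simplifies to

$$H K_P H^\dagger + I + \Theta\Theta^\dagger - \Theta\Phi^\dagger - \Phi\Theta^\dagger = \Psi_r \begin{bmatrix} I & 0 & 0 \\ 0 & I + (D_r D_e^{-1})^2 - 2 D_r D_e^{-1} \Xi & (D_r D_e^{-1} - \Xi) F_{32}^\dagger \\ 0 & F_{32} (D_r D_e^{-1} - \Xi) & I + F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger + \Omega_3^{-1} Q \Omega_3^{-\dagger} \end{bmatrix} \Psi_r^\dagger. \tag{123}$$

In turn, using (123) and the Fischer inequality (which generalizes Hadamard's inequality) for positive semidefinite matrices [30], we obtain

$$\log\det(H K_P H^\dagger + I + \Theta\Theta^\dagger - \Theta\Phi^\dagger - \Phi\Theta^\dagger) \le \log\det(I + (D_r D_e^{-1})^2 - 2 D_r D_e^{-1} \Xi) + \log\det(I + F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger + \Omega_3^{-1} Q \Omega_3^{-\dagger}),$$

which when used with (94) yields

$$C(P) \le \log \frac{\det(I + (D_r D_e^{-1})^2 - 2 D_r D_e^{-1} \Xi)}{\det(I - \Xi^2)} + \max_{Q \succeq 0 :\, \mathrm{tr}(Q) \le P} \log\det(I + F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger + \Omega_3^{-1} Q \Omega_3^{-\dagger}),$$

the first term of which is identical to (113). Thus, it remains only to establish that

$$\max_{Q \succeq 0 :\, \mathrm{tr}(Q) \le P} \log\det(I + F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger + \Omega_3^{-1} Q \Omega_3^{-\dagger}) \le \log\det\left(I + \frac{P}{p} H_r H_e^\sharp H_r^\dagger\right) + o(1). \tag{124}$$

To obtain (124), let

$$\gamma = \sigma_{\max}(F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger) \tag{125}$$

denote the largest singular value of the matrix $F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger$. Since $\log\det(\cdot)$ is increasing on the cone of positive semidefinite matrices, we have

$$\max_{Q \succeq 0 :\, \mathrm{tr}(Q) \le P} \log\det(I + F_{31} F_{31}^\dagger + F_{32} F_{32}^\dagger + \Omega_3^{-1} Q \Omega_3^{-\dagger})$$

$$\le \max_{Q \succeq 0 :\, \mathrm{tr}(Q) \le P} \log\det((1 + \gamma) I + \Omega_3^{-1} Q \Omega_3^{-\dagger}) \tag{126}$$

$$= \log\det\left((1 + \gamma) I + \frac{P}{p} \Omega_3^{-1} \Omega_3^{-\dagger}\right) + o(1) \tag{127}$$

$$= \log\det\left(I + \frac{P}{p} \Omega_3^{-1} \Omega_3^{-\dagger}\right) + o(1)$$

$$= \log\det\left(I + \frac{P}{p} H_r H_e^\sharp H_r^\dagger\right) + o(1), \tag{128}$$

where (126) follows from the fact that $\gamma I - F_{31} F_{31}^\dagger - F_{32} F_{32}^\dagger \succeq 0$, where (127) follows from the fact that water-filling provides a vanishingly small gain over flat power allocation when the channel matrix has full rank (see, e.g., [31]), and where (128) follows from (107).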



D. Analysis of the Masked MIMO Transmission Scheme 

To establish (29), we focus on the two terms in the argument of (19), obtaining

$$I(u; y_r) = \log\det(I + P_t H_r H_r^\dagger) = \log\det(I + P_t \Lambda^2), \tag{129}$$

where we have used (27) to obtain the second equality, and

$$I(u; y_e) = h(y_e) - h(y_e | u) \tag{130}$$

with

$$h(y_e) = \log\det(I + P_t H_e H_e^\dagger) \tag{131}$$

and

$$h(y_e | u) = \log\det(I + P_t H_e V_n V_n^\dagger H_e^\dagger)$$

$$= \log\det(I + P_t H_e (I - V_r V_r^\dagger) H_e^\dagger) \tag{132}$$

$$= \log\det(I + P_t (I - V_r V_r^\dagger) H_e^\dagger H_e), \tag{133}$$

where to obtain (132) we have used that $V_r V_r^\dagger + V_n V_n^\dagger = I$ since $[V_r \ V_n]$ is unitary, and where to obtain (133) we have used that $\det(I + AB) = \det(I + BA)$ for any $A$ and $B$ of compatible dimensions.
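The determinant identity $\det(I + AB) = \det(I + BA)$ used to obtain (133) holds even when $A$ and $B$ are rectangular, so that the two identity matrices have different sizes. A small pure-Python sketch with integer matrices, chosen arbitrarily for illustration, confirms this on a $2 \times 3$ and $3 \times 2$ pair; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def det(m):
    """Determinant by Laplace expansion along the first row
    (adequate for the tiny matrices used here)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def identity_plus(m):
    """Return I + m for a square matrix m."""
    return [[m[i][j] + (1 if i == j else 0) for j in range(len(m))]
            for i in range(len(m))]

A = [[1, 2, 0],
     [0, 1, 3]]          # 2 x 3
B = [[2, 1],
     [0, 1],
     [1, 0]]             # 3 x 2
det_small = det(identity_plus(matmul(A, B)))   # det of a 2 x 2 matrix
det_large = det(identity_plus(matmul(B, A)))   # det of a 3 x 3 matrix
```

Both determinants come out equal, which is what allows the $n_e \times n_e$ expression in (132) to be rewritten as the $n_t \times n_t$ expression in (133).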

In turn, substituting (131) and (133) into (130) we obtain, with some algebra,

$$I(u; y_e) = -\log\det(I - P_t (I + P_t H_e^\dagger H_e)^{-1} (V_r V_r^\dagger H_e^\dagger H_e))$$

$$= -\log\det(I - P_t V_r^\dagger H_e^\dagger H_e (I + P_t H_e^\dagger H_e)^{-1} V_r)$$

$$= -\log\det(V_r^\dagger (I + P_t H_e^\dagger H_e)^{-1} V_r). \tag{134}$$

Finally, using (129) and (134) in the argument of (19), and again using (27), we obtain [cf. (29)]

$$R_{SN}(P) = \log\det(I + P_t \Lambda^2) + \log\det(V_r^\dagger (I + P_t H_e^\dagger H_e)^{-1} V_r)$$

$$= \log\det(P_t I + \Lambda^{-2}) + \log\det(U \Lambda V_r^\dagger (I + P_t H_e^\dagger H_e)^{-1} V_r \Lambda U^\dagger)$$

$$= \log\det(P_t I + \Lambda^{-2}) + \log\det(H_r (I + P_t H_e^\dagger H_e)^{-1} H_r^\dagger),$$

as required.

Finally, to establish the first equality in (30), we take the limit $P_t \to \infty$ in (29). In particular, we have

Psn(P) 

= logdct(I + P t - 1 A- 2 ) 

+ logdetCH^Pt" 1 ! + HtHe)- 1 ^), 

= 0{P^) + logdetlH^^tHe)" 1 + OiP^nt) 

(135) 

= logdet(H r (HtH c )- 1 Ht) 

+ logdet(I + (HtH c )- 1 / 2 0(P t -i)( H tH c )-t/ 2 ) 
= logdetCH^HtHe)- 1 ^) + OiP,- 1 ), (136) 

where to obtain (135) we have used that $(\epsilon I + M)^{-1} = M^{-1} + O(\epsilon)$ as $\epsilon \to 0$ for any invertible $M$ [32], and where we have also used that $\log\det(I + W)$ is continuous in the entries of $W$.
VII. MIMOME Channel Scaling Laws

We first verify Claim 1, then use it to establish Corollary 2.

Proof of Claim 1: Clearly, $\sigma_{\max}(H_r, H_e) = \infty$ when the subspace $\mathcal{S}_r$ [cf. (2a)] is nontrivial. Otherwise, it is known (see, e.g., [33]) that $\sigma_{\max}(\cdot)$ is the largest generalized singular value of $(H_r, H_e)$ as defined in (9).

To establish that the secrecy capacity is zero whenever $\sigma_{\max}(H_r, H_e) \le 1$, it suffices to consider the high-SNR secrecy capacity (24) when $H_e$ has full column-rank, which is clearly zero whenever $\sigma_{\max} \le 1$.

When $\sigma_{\max}(H_r, H_e) > 1$, there exists a vector $v$ such that $\|H_r v\| > \|H_e v\|$. Then, choosing $x = u \sim \mathcal{CN}(0, P v v^\dagger)$ in the argument of (19) yields a strictly positive rate $R(P)$, so $C(P) \ge R(P) > 0$ for all $P > 0$. ■

Combining Claim 1 and Fact 1 below, which is established in [34, p. 642], yields Corollary 2.

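Fact 1 ([34], [35]): Suppose that $H_r$ and $H_e$ have i.i.d. $\mathcal{CN}(0, 1)$ entries. Let $n_r, n_e, n_t \to \infty$, while keeping $n_r/n_e = \gamma$ and $n_t/n_e = \beta$ fixed. Then if $\beta < 1$,

$$\sigma_{\max}(H_r, H_e) \to \gamma \left[\frac{1 + \sqrt{1 - (1 - \beta)\left(1 - \frac{\beta}{\gamma}\right)}}{1 - \beta}\right]^2. \tag{137}$$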
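The scaling laws quoted in the abstract (the eavesdropper needs 3 times as many antennas as the sender and receiver jointly, with the worst case at a 2 : 1 split between sender and receiver) can be checked numerically against the limit in Fact 1. The sketch below assumes the limit has the form $\gamma\,[(1 + \sqrt{1 - (1-\beta)(1-\beta/\gamma)})/(1-\beta)]^2$; this form is an assumption reconstructed from the damaged display, and the function name is hypothetical.

```python
import math

def sigma_max_limit(gamma, beta):
    """Assumed large-system limit of the largest generalized singular
    value of (H_r, H_e), with gamma = n_r/n_e and beta = n_t/n_e < 1."""
    if not (0 < beta < 1 and gamma > 0):
        raise ValueError("need 0 < beta < 1 and gamma > 0")
    root = math.sqrt(1 - (1 - beta) * (1 - beta / gamma))
    return gamma * ((1 + root) / (1 - beta)) ** 2

# Eavesdropper with n_e = 3 (n_t + n_r), i.e. beta + gamma = 1/3.
at_two_to_one = sigma_max_limit(gamma=1 / 9, beta=2 / 9)    # n_t : n_r = 2 : 1
at_one_to_two = sigma_max_limit(gamma=2 / 9, beta=1 / 9)    # n_t : n_r = 1 : 2
at_three_to_one = sigma_max_limit(gamma=1 / 12, beta=1 / 4) # n_t : n_r = 3 : 1
```

Under this assumed form, the limit equals 1 exactly at the 2 : 1 split and stays below 1 for the other splits, so by Claim 1 the secrecy capacity is driven to zero for every split once $n_e = 3(n_t + n_r)$.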



VIII. Concluding Remarks 

This paper resolves several open questions regarding secure transmission with multiple antennas. First, it establishes the existence of a computable expression for the secrecy capacity of the MIMOME channel. Second, it establishes that a Gaussian input distribution optimizes the secrecy capacity expression of Csiszár and Körner for the MIMOME channel, and thus that capacity is achieved by Gaussian wiretap codes. Third, it establishes the optimum covariance structure for the input, exploiting hidden convexity in the problem. Nevertheless, many questions remain that are worth exploring. As one example, it remains to be determined whether such developments based on Sato's bounding techniques can be extended beyond sum-power constraints, as the channel enhancement based approach of [22] can.

In addition, our analysis highlights the useful role that the GSVD plays both in calculating the capacity of the MIMOME channel in the high-SNR regime, and in designing codes for approaching this capacity. At the same time, we observed that a simple, semi-blind masked MIMO scheme can be arbitrarily far from capacity. However, for the special case of the MISOME channel, [4] shows that the corresponding masked beamforming scheme achieves rates close to capacity at high SNR. Thus, it remains to be determined whether there are better and/or more natural generalizations of the masked beamforming scheme for the general MIMOME channel. This warrants further investigation.

More generally, semi-blind schemes have the property that they require only partial knowledge of the channel to the eavesdropper. Much remains to be explored about what secrecy rates are achievable with such partial information. One recent work in this area [5] illustrates the use of interference alignment techniques for the compound extension of the multi-antenna wiretap channel. Another recent work [36] studies a constant-capacity compound wiretap channel model, which again captures the constraint that the transmitter only knows the capacity (or an upper bound on the capacity) of the channel to the eavesdropper. Further insights may arise from considering other multiple eavesdropper scenarios with limited or no collusion.

Finally, we characterize when an eavesdropper can prevent secure communication, i.e., drive the secrecy capacity to zero. Our scaling laws on antenna requirements and their optimal distribution in the limit of many antennas provide convenient rules of thumb for system designers, as the results become
independent of the channel matrices in this limit. However, 
it remains to quantify for what numbers of antennas these 
asymptotic results become meaningful predictors of system 
behavior. As such, this represents yet another useful direction 
for further research. 
Appendix I
Proof of Claim 2



To begin,

$$I(x; y_r | y_e) = I(x; U_1^\dagger y_r, U_2^\dagger y_r | y_e) \tag{138}$$

$$= I(x; \tilde{y}_r, U_1^\dagger y_r - V_1^\dagger y_e | y_e)$$

$$= I(x; \tilde{y}_r, Tx | y_e), \tag{139}$$

where (138) follows from the fact that $[U_1 \ U_2]$ is unitary, and where (139) follows from substituting for $y_r$ and $y_e$ from (1), using (45), and the fact that

$$U_1^\dagger z_r \overset{\text{a.s.}}{=} V_1^\dagger z_e, \tag{140}$$

since

$$\mathrm{cov}(U_1^\dagger z_r, V_1^\dagger z_e) = E[U_1^\dagger z_r z_e^\dagger V_1] = U_1^\dagger \Phi V_1 = I.$$

Now when $I(x; y_r | y_e) < \infty$, we have from (139) that $Tx = 0$, so $I(x; y_r | y_e) = I(x; \tilde{y}_r | y_e)$, establishing (38).

Similarly,

$$I(x; y_e | y_r) = I(x; V_1^\dagger y_e, V_2^\dagger y_e | y_r) \tag{141}$$

$$= I(x; \tilde{y}_e, V_1^\dagger y_e - U_1^\dagger y_r | y_r)$$

$$= I(x; \tilde{y}_e, Tx | y_r), \tag{142}$$

where we have used that $[V_1 \ V_2]$ is unitary to obtain (141) and (140) to obtain (142). When $I(x; y_e | y_r) < \infty$, we have from (142) that $Tx = 0$, so $I(x; y_e | y_r) = I(x; \tilde{y}_e | y_r)$, establishing (41).

To verify the "only if" statement of the last part of the claim, when $I(x; y_r | y_e) = \infty$, we expand (139) via the chain rule to obtain

$$I(x; y_r | y_e) = I(x; \tilde{y}_r | y_e) + I(x; Tx | \tilde{y}_r, y_e), \tag{143}$$

and note that if $Tx = 0$ then the second term on the right-hand side of (143) is zero. But the first term on the right-hand side is finite, so $\mathrm{cov}(Tx) \ne 0$, i.e., (44) holds.

To verify the "if" statement of the last part of the claim, we use the chain rule to write

$$I(x; y_r | y_e) = I(x; \tilde{y}_r, Tx | y_e) \ge I(x; Tx | y_e) \ge I(x; Tx) - I(x; y_e), \tag{144}$$

and note that the first term in (144) is infinite when $\mathrm{cov}(Tx) \ne 0$, while the second term is finite. ■

Appendix II

Optimizing $R_+(p_x, K_\Phi)$ over $p_x$ with Singular $K_\Phi$

To establish that $I(x; y_r | y_e)$ with $z \sim \mathcal{CN}(0, K_\Phi)$ for singular $K_\Phi$ is maximized subject to the constraint $\mathrm{cov}(x) = K_P$ when $x$ is Gaussian (hence, justifying (50) in this case), we exploit Claim 2.

In particular, if for all $p_x$ meeting the covariance constraint we have $I(x; y_r | y_e) < \infty$, then we can use (38), expanding and bounding $I(x; \tilde{y}_r | y_e)$ in the same manner as (53)-(54), with $\tilde{y}_r$, $\tilde{z}_r$, $\tilde{\Lambda} = U_2^\dagger \Lambda U_2$ (the error covariance in the MMSE estimate of $\tilde{y}_r$ from $y_e$), and $\tilde{\Phi} = U_2^\dagger \Phi$ [cf. (46)] replacing $y_r$, $z_r$, $\Lambda$, and $\Phi$, respectively. Specifically, we obtain that

$$I(x; \tilde{y}_r | y_e) = h(\tilde{y}_r | y_e) - h(\tilde{z}_r | z_e) \tag{145}$$

is maximized when $x$ is Gaussian.

If, instead, there exists a $p_x$ satisfying the covariance constraint such that $I(x; y_r | y_e) = \infty$, then by the "only if" part of the last statement of Claim 2 we have that (44) holds. But by the "if" part of the same statement we know that $I(x; y_r | y_e) = \infty$ for any $p_x$ such that (44) holds, and in particular we may choose $p_x$ to be Gaussian. ■

Appendix III
Proof of Lemma 3 for Singular $\bar{K}_\Phi$

We begin with the following: 

Claim 3: There exists a matrix H such that the combined 
channel matrix (@]i can be expressed in the form 



where 



H = WH, 



K* = WSW* 



(146) 



(147) 



is the compact singular value decomposition of K#, i.e., 
where W has orthogonal columns (W^W = I), and the 
diagonal matrix S has strictly positive diagonal entries. 
Hence, the column space of H is a subspace of the column 
space of W. 

Proof: We establish our result by contradiction. Suppose 
the claim were false. Then clearly 7(x; y r , y c ) = 00 when we 
choose x = tv where v £ Null(W) and var t > 0, which 
implies that 

i?+(K P ,K # ) = J(x;y r |y e ) = 7(x;y r ,y c ) - 7(x;y c ) = 00, 



since 7(x; y c ) < 00 as cov(z c ) = I is nonsingular. Hence, 

i?+(K P ,K*) = max R+(K P , K # ) = 00. (148) 

KpeXp 

But from (l48l l in Lemma |2] we know i?+(Kp,K*) < 00, 
which contradicts ( 1148b and hence (1146b must hold. ■ 

Using Claim [3] we see that in this case the original channel 
(fl~|l with cov(z) = K$ can be replaced with the equivalent 
combined channel 

y = Hx + z (149) 



where 



y 4 W f 



z4 W f z 



with cov(z) = S. Hence, we can write 

fi+(Kp,K$) = 7(x;y r ,y e ) - J(x;y e ), 

where 

det(H + HK P Ht) 



^(x;y r ,y ) = /(x;y) = log- 



det(H) 



and 



/(x;y c ) = logdct(I + H c KpHt). 



But from the saddle point property it follows that 
expressed as 

, det(H + HK P Ht) 

a= argmm log . 

{h : wswtGKj,} aet(c) 



(150) 

(151) 
can be 

(152) 



In turn, the KKT conditions associated with the optimization 
JT521 are 



(B + HKpH^ 1 = W'TW, 
or, equivalently, 

HKpH* = HW f TW(3 + HK P H f ), 



(153) 



where the dual variable T is of the same block diagonal form 
as in the nonsingular case, viz., d60l ). Multiplying the left- and 
right-hand sides of ( 11531 ) by W and W^, respectively, and 
using ( 1146b and ( 11471 ) we obtain d63l . Thus, the remainder of 
the proof uses the arguments following d63l in the proof for 
the nonsingular case to establish the desired result. ■ 

Appendix IV 
Proof of Proposition^ 

Consider first the right-hand side of dTTT i. Since h(y T — &y e ) 
is concave in Kp G Xp and differentiable over %p, the KKT 
conditions associated with the Lagrangian 

£®(Kp,A,*) 

= %r - ©yc) + tr(*K P ) - A(tr(K P ) - P) (154) 

are both necessary and sufficient, i.e., Kp is a solution to the 
right-hand side of ( |7Tb if and only if there exists a A > and 
* >z such that 

(H r - ©H ) t r(Kp,Kp)- 1 (H r - 0H O ) + * = AI, 
tr(*K P ) = 0, and A(tr(K P ) - P) = 0, 

(155) 
where



r(Kp,K P ) 

= cov(y r - 0y o ) 
= 1 + 06^ ®& 



+ (H r - 0H c )K P (H r - 0H c ) f . (156) 

Considering next the left-hand side of ( f7Tl ). to which Kp 
is a solution, we have, from the associated KKT conditions, 
that there exists A' > and *' >z such that 

v Kp My r - ®(Kp)yc)| Kp= K P + *' = AT 

tr(*'Kp)=0, and A'(tr(K P ) - P) = 0, 

where 0(Kp) is as defined in ( TTSb . 

Thus, it remains to show that ( 1155b and d 157b are identical 
when Kp = Kp. Focusing on the first equation in ( 11571 i. we 
have 



(157) 



lK P =K f 



V Kp /j(yr-0(Kp)y c 

- v K p/i(yr|yo)| Kp= K P 

= V Kp {h(y T ,y e ) - %e)}| Kp= K P 

= H t (HKpH t + K*) _1 H H*(I + H c K p h£) ^H c . 

(158) 

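In turn, substituting for $H_t$ and $\bar{K}_\Phi$ from (20) and (21), and using (17), the first matrix inverse in (158) can be expressed in the form

$$\begin{aligned}
(H_t \bar{K}_P H_t^\dagger + \bar{K}_\Phi)^{-1}
&= \begin{pmatrix} I + H_r \bar{K}_P H_r^\dagger & \bar{\Phi} + H_r \bar{K}_P H_e^\dagger \\ \bar{\Phi}^\dagger + H_e \bar{K}_P H_r^\dagger & I + H_e \bar{K}_P H_e^\dagger \end{pmatrix}^{-1} \\
&= \begin{pmatrix} \Lambda(\bar{K}_P)^{-1} & -\Lambda(\bar{K}_P)^{-1} \bar{\Theta} \\ -\bar{\Theta}^\dagger \Lambda(\bar{K}_P)^{-1} & (I + H_e \bar{K}_P H_e^\dagger)^{-1} + \bar{\Theta}^\dagger \Lambda(\bar{K}_P)^{-1} \bar{\Theta} \end{pmatrix},
\end{aligned} \tag{159}$$

where $\Lambda(\bar{K}_P)$ is as defined in (55), and where we have used the matrix inversion lemma (see, e.g., [32]). Substituting (159) into (158), and using the notation (55), yields, after some simplification,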

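$$\begin{aligned}
\nabla_{K_P} h(y_r - \Theta(K_P) y_e)\big|_{K_P = \bar{K}_P}
&= H_t^\dagger (\bar{K}_\Phi + H_t \bar{K}_P H_t^\dagger)^{-1} H_t - H_e^\dagger (I + H_e \bar{K}_P H_e^\dagger)^{-1} H_e \\
&= (H_r - \bar{\Theta} H_e)^\dagger\, \bar{\Lambda}^{-1} (H_r - \bar{\Theta} H_e).
\end{aligned} \tag{160}$$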

Comparing (160) with the first equation in (155), we see that it remains only to show that $\Gamma(\bar{K}_P, \bar{K}_P) = \bar{\Lambda}$, which is verified as follows. First, $\bar{\Theta} y_e$ is the MMSE estimate of $y_r$ from $y_e$ when $K_P = \bar{K}_P$, and $\Gamma(\bar{K}_P, \bar{K}_P) = \mathrm{cov}(y_r - \bar{\Theta} y_e) = \mathrm{cov}(y_r \mid y_e)$ is the error covariance associated with the estimate. But by definition [cf. (55)], $\bar{\Lambda} = \mathrm{cov}(y_r \mid y_e)$ is also the error covariance associated with the MMSE estimate when $K_P = \bar{K}_P$, so the conclusion follows. ■
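The two facts this proof rests on can be checked numerically: the gradient identity in (160), and the equality of $\Gamma$ and $\Lambda$ when $\Theta$ is the MMSE coefficient. The sketch below is only an illustration under assumed dimensions and randomly drawn matrices (the helper `crandn` and all numerical values are hypothetical); it evaluates both sides at an arbitrary PSD $K_P$ rather than at an actual saddle point, which suffices because both identities are algebraic.

```python
import numpy as np

rng = np.random.default_rng(1)
nt, nr, ne = 4, 3, 2  # assumed antenna counts, for illustration only

def crandn(*shape):
    """Hypothetical helper: i.i.d. circularly symmetric complex Gaussian entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def herm(M):
    """Conjugate transpose."""
    return M.conj().T

H_r, H_e = crandn(nr, nt), crandn(ne, nt)
A = crandn(nt, nt)
K = A @ herm(A)                              # arbitrary PSD input covariance
Phi = crandn(nr, ne)
Phi *= 0.8 / np.linalg.norm(Phi, 2)          # keeps K_Phi nonsingular

I_r, I_e = np.eye(nr), np.eye(ne)
S_ee = I_e + H_e @ K @ herm(H_e)             # cov(y_e)
S_re = Phi + H_r @ K @ herm(H_e)             # cross-covariance of y_r and y_e
S_ee_inv = np.linalg.inv(S_ee)

Theta = S_re @ S_ee_inv                      # MMSE coefficient
Lam = I_r + H_r @ K @ herm(H_r) - S_re @ S_ee_inv @ herm(S_re)  # error covariance

# Left-hand side of (160): gradient of h(y_r, y_e) - h(y_e).
H_t = np.vstack([H_r, H_e])
K_Phi = np.block([[I_r, Phi], [herm(Phi), I_e]])
lhs = herm(H_t) @ np.linalg.inv(H_t @ K @ herm(H_t) + K_Phi) @ H_t \
      - herm(H_e) @ S_ee_inv @ H_e

# Right-hand side of (160).
D = H_r - Theta @ H_e
rhs = herm(D) @ np.linalg.inv(Lam) @ D

# Gamma from (156), evaluated with the MMSE coefficient.
Gamma = (I_r + Theta @ herm(Theta) - Theta @ herm(Phi) - Phi @ herm(Theta)
         + D @ K @ herm(D))

print(np.allclose(lhs, rhs), np.allclose(Gamma, Lam))
```

Both comparisons hold up to floating-point tolerance, which mirrors the block-inverse computation in (159) and the MMSE argument that closes the proof.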

Appendix V 
Proof of Lemma 4 for Singular $\bar{K}_\Phi$

First, note that via (41) with (48), we have that $R_+(K_P, \bar{K}_\Phi) = I(x; y_r \mid y_e) < \infty$ for all $K_P \in \mathcal{K}_P$. Hence, via (38) of Claim 2 we have

$$R_+(K_P, \bar{K}_\Phi) = I(x; \hat{y}_r \mid y_e), \quad \forall K_P \in \mathcal{K}_P, \tag{161}$$

with the equivalent observations $\hat{y}_r$ as given by (39) with (40). Moreover, the noise cross-covariance $\hat{\Phi} = U_2^\dagger \bar{\Phi}$ [cf. (46)] in the equivalent channel model has all its singular values strictly less than unity, i.e., the associated $\hat{K}_\Phi$ is nonsingular.
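The link between the singular values of the cross-covariance and the singularity of the joint noise covariance can be illustrated numerically. The sketch below assumes, as a reading of the construction rather than a quotation of (46), that $U_2$ collects the left singular vectors of $\Phi$ whose singular values are strictly less than one; the specific singular values and dimensions are made up for illustration. The eigenvalues of $K_\Phi = [I, \Phi; \Phi^\dagger, I]$ are $1 \pm \sigma_i$ (plus ones), so a unit singular value makes $K_\Phi$ singular, and projecting it away restores nonsingularity.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3  # assumed dimension of both noise vectors, for illustration only

# Build a real cross-covariance with one singular value exactly equal to 1.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = np.array([1.0, 0.6, 0.2])
Phi = U @ np.diag(sigma) @ V.T

def joint_cov(P):
    """Joint noise covariance [[I, P], [P^T, I]] for a real cross-covariance P."""
    m, k = P.shape
    return np.block([[np.eye(m), P], [P.T, np.eye(k)]])

# The original K_Phi is singular: its smallest eigenvalue is 1 - 1 = 0.
min_eig_full = np.linalg.eigvalsh(joint_cov(Phi)).min()

# Assumed reduction: keep only the directions with singular value < 1.
U2 = U[:, sigma < 1]
Phi_hat = U2.T @ Phi
sv_hat = np.linalg.svd(Phi_hat, compute_uv=False)

# The reduced joint covariance is nonsingular: smallest eigenvalue 1 - 0.6.
min_eig_hat = np.linalg.eigvalsh(joint_cov(Phi_hat)).min()

print(min_eig_full, sv_hat, min_eig_hat)
```

In this toy instance the reduced cross-covariance has singular values 0.6 and 0.2, and the smallest eigenvalue of the reduced joint covariance is approximately 0.4.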

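Thus, we can apply to this equivalent model the arguments of the proof of Lemma 4 for the nonsingular case. In particular, from (72) onwards we replace $y_r$ with $\hat{y}_r$, we replace $\Theta(K_P)$ and $\bar{\Theta}$ with, respectively, [cf. (18), (17)]

$$\hat{\Theta}(K_P) \triangleq (\hat{H}_r K_P H_e^\dagger + \hat{\Phi})(I + H_e K_P H_e^\dagger)^{-1} = U_2^\dagger \Theta(K_P) \tag{162}$$

and

$$\hat{\bar{\Theta}} = \hat{\Theta}(\bar{K}_P) = U_2^\dagger \bar{\Theta}, \tag{163}$$

which is the coefficient in the MMSE estimate of $\hat{y}_r$ from $y_e$, and we replace $H_{\mathrm{eff}}$ and $J$ with, respectively, [cf. (77)]

$$\hat{H}_{\mathrm{eff}} \triangleq \hat{J}^{-1/2} (\hat{H}_r - \hat{\bar{\Theta}} H_e)$$

and

$$\hat{J} = I - \hat{\Phi}\hat{\Phi}^\dagger + (\hat{\bar{\Theta}} - \hat{\Phi})(\hat{\bar{\Theta}} - \hat{\Phi})^\dagger = U_2^\dagger J U_2,$$

noting that $\hat{J} \succ 0$ since $\hat{K}_\Phi \succ 0$. With these changes, and with the SVD

$$\hat{H}_{\mathrm{eff}} = A \Sigma_{\mathrm{eff}} B^\dagger$$

replacing (78), the arguments apply and it follows that $(\hat{H}_r - \hat{\bar{\Theta}} H_e) S$ has full column rank. Since

$$(\hat{H}_r - \hat{\bar{\Theta}} H_e) S = U_2^\dagger (H_r - \bar{\Theta} H_e) S,$$

it then follows that $(H_r - \bar{\Theta} H_e) S$ has full column rank. ■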

Appendix VI 
Proof of Lemma 5 for Singular $\bar{K}_\Phi$

Consider first the case in which $H_r \ne \bar{\Theta} H_e$, and note that

$$R_+(\bar{K}_P, \bar{K}_\Phi) - R_-(\bar{K}_P) = I(x; y_e \mid y_r) < \infty,$$

where the equality is reproduced from (85), and where the inequality follows from (48) and the fact that $R_-(\bar{K}_P) \ge 0$. Hence, applying (41) from Claim 2 we have

$$R_+(\bar{K}_P, \bar{K}_\Phi) - R_-(\bar{K}_P) = I(x; \hat{y}_e \mid y_r),$$

with the equivalent observations $\hat{y}_e$ as given by (42) with (43). Moreover, the noise cross-covariance

$$\hat{\Phi} = \bar{\Phi} V_2 \tag{164}$$

in the equivalent channel model has all its singular values strictly less than unity, i.e., the associated $\hat{K}_\Phi$ is nonsingular.

Thus, we can apply to this equivalent model the corresponding arguments of the proof of Lemma 5 for the nonsingular case. In particular, from (85) onwards we replace $y_e$ and $z_e$ with, respectively, $\hat{y}_e$ and $\hat{z}_e$, we replace $\Lambda_b$ with

$$\hat{\Lambda}_b = I + \hat{H}_e \bar{K}_P \hat{H}_e^\dagger - (\hat{\Phi}^\dagger + \hat{H}_e \bar{K}_P H_r^\dagger)(I + H_r \bar{K}_P H_r^\dagger)^{-1}(\hat{\Phi} + H_r \bar{K}_P \hat{H}_e^\dagger) = V_2^\dagger \Lambda_b V_2,$$

which is the backward error covariance associated with the linear MMSE estimate of $\hat{y}_e$ from $y_r$, and we replace the use of (22) in (87) and (88) with its form for the equivalent channel, viz., for all full column-rank $S$ such that $S S^\dagger = \bar{K}_P$,

$$\hat{\Phi}^\dagger H_r S = V_2^\dagger \bar{\Phi}^\dagger H_r S = V_2^\dagger H_e S = \hat{H}_e S,$$

where to obtain the first equality we have used (164), where
to obtain the second equality we have used Property Q] and 
where to obtain the third equality we have used (43).

Finally, consider the case in which $H_r = \bar{\Theta} H_e$. Since (48) holds, so does (38) of Claim 2 and thus

$$R_+(\bar{K}_P, \bar{K}_\Phi) = I(x; \hat{y}_r \mid y_e), \tag{165}$$

with the equivalent observations $\hat{y}_r$ as given by (39) with (40). Thus, we can apply to this equivalent model the corresponding arguments of the proof of Lemma 5 for the nonsingular case. In particular (and as in Appendix V), from (90) onwards (165) implies we replace $y_r$ and $z_r$ with, respectively, $\hat{y}_r$ and $\hat{z}_r$, we replace $\bar{\Phi}$ with [cf. (46)] $\hat{\Phi} = U_2^\dagger \bar{\Phi}$, the coefficient in the MMSE estimate of $\hat{z}_r$ from $z_e$, and we replace $\bar{\Theta}$ with [cf. (18), (163)]

$$\hat{\bar{\Theta}} = (\hat{H}_r \bar{K}_P H_e^\dagger + \hat{\Phi})(I + H_e \bar{K}_P H_e^\dagger)^{-1} = U_2^\dagger \bar{\Theta}, \tag{166}$$

the coefficient in the MMSE estimate of $\hat{y}_r$ from $y_e$.

Note that in obtaining the counterpart of (92) we use that $\hat{y}_r - \hat{\bar{\Theta}} y_e = \hat{z}_r - \hat{\bar{\Theta}} z_e$ since

$$\hat{H}_r = U_2^\dagger H_r = U_2^\dagger \bar{\Theta} H_e = \hat{\bar{\Theta}} H_e, \tag{167}$$

where the first equality follows from (40), the second equality follows from the assumption $H_r = \bar{\Theta} H_e$, and the third equality from (166). Moreover, in obtaining the counterpart of (93) we use that $\hat{\bar{\Theta}} = \hat{\Phi}$ when (167) holds. ■